Foreword


General Relativity (GR) is founded on the observation of the anomalous perihelion
precession of Mercury, discovered by Le Verrier and refined by Newcomb, on the
Michelson-Morley experiment, and on the precision Eötvös experiment. Its theoretical
basis is Special Relativity (previously called the restricted theory of relativity), the
Einstein Equivalence Principle (EEP) and the realization that the metric is the
dynamical quantity of gravity, together with the Principle of General Covariance
and the absence of other dynamical quantities. The establishment of GR in 1915 was
a community effort, with Albert Einstein clearly playing the dominant role. For one
hundred years, its applicability from the solar system to cosmology has prevailed.
If one includes the cosmological constant (proposed in 1917 by Einstein) in GR,
no regime has been firmly established in which it fails to apply. The only
potential exception is the missing mass (dark matter)-deficient acceleration issue.
Dark energy and quantum gravity are needed in the present theoretical foundation
of physics; however, more experimental clues are needed. The applicability of the GR
framework has already been demonstrated in theoretical inflation models in which quantum
fluctuations lead to structure formation with the experimentally observed spectrum.

To celebrate the GR centennial, we solicited 23 chapters for these
two volumes, which consist of five parts:


Part I. Genesis, Solutions and Energy. 
Part II. Empirical Foundations. 
Part III. Gravitational Waves.
Part IV. Cosmology. 
Part V. Quantum Gravity. 


Volume 1 consists of Part I, Part II and Part III; Volume 2 consists of Part IV and 
Part V. 

In Part I, Valerie Messager and Christophe Letellier start in Chapter 1 with a 
genesis of special relativity to set the stage. They rely on the original literature to 
make the development clear and connected. Thanks to much thorough research
over the last 50 years, the path to general relativity is clear. A concise exposition of
the path is presented in Chapter 2. In Chapter 3, Christian Heinicke and Friedrich 
Hehl present the historical development and detailed properties of the basic and 
fundamental spherical Schwarzschild and axisymmetric Kerr solutions. In Chapter 4,
Chiang-Mei Chen, James Nester and Roh-Suan Tung expound the important
and useful concept of energy with its many facets and various applications.

In Part II, the empirical foundations of GR are examined. First, the cornerstone
Einstein Equivalence Principle (EEP) is explored. Ever since about 100 ps from the
Big Bang (or equivalent/substitute), the time of the electroweak phase transition
(or the equivalent/substitute) at the quasi-equilibrium Higgs/intermediate boson
energy scale, photons and charged particles have been abundant. With the premetric
formulation of electrodynamics, we examine the tests of EEP via the metric-induced
spacetime constitutive tensor density. The non-birefringence of cosmic electromagnetic
wave propagation in spacetime is observed to ultrahigh precision. This
constrains the spacetime constitutive tensor density to Maxwell-Lorentz (metric)
form plus a scalar (dilaton) degree of freedom and a pseudoscalar (axion) degree
of freedom to high precision. The accurate agreement of the cosmic microwave background
(CMB) spectrum with the Planck spectrum constrains the fractional change of the
cosmic dilaton to be less than 8 × 10⁻⁴. The Galileo weak equivalence principle
(WEP) experiments (Eötvös-type experiments) constrain the fractional dilatonic
change in the solar system to be less than 10⁻¹⁰. Accompanying the axion degree
of freedom is the rotation of linear polarization in the cosmic propagation of
electromagnetic waves, called cosmic polarization rotation (CPR). Sperello di Serego
Alighieri reviews the constraints from radio galaxy observations and CMB polarization
observations to give a general constraint of 0.02 rad for the mean (uniform)
CPR and also a constraint of 0.02 rad for the CPR fluctuations. In many inflation
models dilatons and axions play important roles; these investigations are crucial
in giving clues to or constraints on the models. Frequency and time are the most
precise metrological quantities, and their use in gravity experiments is unavoidable. The
use of GR in time synchronization and in GPS, GLONASS, Galileo and Beidou
has become common knowledge. There are two good ways to compare precision clocks:
(i) fiber links; (ii) space optical links using laser ranging. Etienne Samain expounds the
space optical link approach and addresses the laser ranging missions T2L2 (Time Transfer
by Laser Link), LRO (Lunar Reconnaissance Orbiter) and LTT (Laser Time
Transfer), together with future space mission proposals for fundamental physics and
solar system science/navigation in which laser links are of prime importance. Solar-system
observation provided the original impetus for and the first confirmation of GR.
Chapter 8 summarizes the progress of classical solar system tests and explores their
future potential. Improvement by three or more orders of magnitude is still
possible.

Perhaps the most dramatic development in testing relativistic gravity and in 
improving the dynamical foundations of general relativity is the discovery and obser- 
vation of pulsars, binary pulsars, millisecond pulsars and double pulsars since 1967, 
1974, 1982 and 2003 respectively. Richard Manchester reviews the pulsar observa- 
tion in its relation with gravity in Chapter 9 with a brief introduction to basic pulsar 
properties and pulsar timing. He presents a rather thorough account of dynamical 
tests of GR and the strong equivalence principle together with a lucid but in-depth 
account of GW detection using pulsar timing arrays (PTAs). See front cover for an
illustrative schematic of a PTA.

In 1916 Einstein predicted gravitational waves (GWs) in GR almost immediately
after founding the theory. The existence of gravitational waves is a direct
consequence of general relativity and an unavoidable consequence of all relativistic
gravity theories with a finite velocity of propagation. Their importance in GR is like
that of electromagnetic waves in the Maxwell-Lorentz theory of electromagnetism.
Einstein’s general relativity and relativistic gravity theories predict the existence of
gravitational waves. Gravitational waves propagate in spacetime, forming ripples
of spacetime geometry. In the introductory chapter of Part III, Kazuaki Kuroda,
Wei-Ping Pan and I review and summarize the complete GW spectrum, the methods
of detection, and the detection sensitivities in various frequency bands, with
a brief introduction to GW sources. At the time Einstein predicted GWs in GR,
he estimated that they were experimentally undetectable because of their feeble strength.
However, thanks to one hundred years of development of experimental methods and
technology, together with the discovery of various astrophysical compact objects
and cosmological sources, GWs are now on the verge of detection in three
frequency bands. The very low frequency band (10 fHz-300 pHz) GWs are on the
verge of detection by the PTAs; Richard Manchester covers this part in his chapter
on pulsars and gravity in Part II. As mentioned in Chapter 10, PTA observations
have already constrained the isotropic GW background to a level excluding
most current models of supermassive black hole formation. This is a strong sign
that PTA observation is on the verge of detecting GWs. The high frequency band
(10 Hz-100 kHz) GWs are on the verge of detection by ground-based interferometers;
Kazuaki Kuroda addresses the detection methods and the sources in the second
chapter of Part III. The extremely low (Hubble) frequency band (1 aHz-10 fHz)
GWs may also be on the verge of detection by CMB polarization observations; the
present status is briefly reviewed in the introductory chapter of Part III. The low
frequency band (100 nHz-100 mHz) and the middle frequency band detections will
have the greatest S/N ratios according to present expectations. We review the
sources, goal sensitivities and various mission proposals, together with the current
supporting activities, in the third chapter of Part III. The GW quadrupole radiation
formula has already been verified by binary pulsar observations. In the next
hundred years we will see great discoveries and immense focused activity toward
the establishment and flourishing of GW astronomy and GW cosmology. GW physics
and GW astronomy will become a precision discipline in the coming century.

The development of cosmology has been most dramatic during the last hundred years.
The progress from the Kapteyn universe of 1915, an observed disk star system of 10 kpc
diameter and 2 kpc thickness with the Sun near its center, to today’s full-fledged
precision cosmology is monumental in human history. It is fortunate, both for
observational cosmology and for GR, that the development of observational cosmology
has had GR as a theoretical basis and has gone hand-in-hand with the development of
general relativity. Using the Cosmological Principle, Einstein looked into cosmological
solutions in GR in 1917. The fast development of the observational distance
ladder around that time soon extended the reach of astronomy to the modern cosmos.
Studies of the fundamental issues on the origins of the cosmos have led to the anthropic
principle, cosmic inflation, and cosmic landscape scenarios. The cosmos is believed
to be open in (extended beyond) the Hubble distance scale. Part IV consists of
seven chapters: Martin Bucher and I present some introductory remarks with a
discussion of the missing mass-deficient acceleration issue in the first chapter; Marc
Davis reviews the observation and evolution of cosmic structure; Martin Bucher
gives a rather comprehensive exposition of the physics of the CMB (on almost every
aspect of cosmology); Xiangcun Meng, Yan Gao and Zhanwen Han review the SN Ia
as a standardizable distance candle, its nature, its progenitors and its role in
cosmology, together with related current issues; Toshifumi Futamase writes on
gravitational lensing in cosmology; K. Sato and Jun’ichi Yokoyama on cosmic inflation,
with a brief historical exposition of the development in Japan and Russia; David
Chernoff and Henry Tye on inflation and cosmic strings from the point of view of
string theory.

The quest for a satisfactory quantum description of gravity began very early.
In his first paper on GWs in 1916, Einstein thought that quantum effects must modify
general relativity. In 1927, Klein argued that quantum theory must ultimately
modify the role of spatiotemporal concepts in fundamental physics. Part V
on Quantum Gravity consists of 4 chapters. Chapter 20 gives a bird’s-eye survey of
the development of the fundamental ideas of quantum gravity, together with possible
observations of quantum gravitational effects in the foreseeable future. The classical
age (1958-1969, according to the chronological classification of Rovelli) started
with the ADM canonical formalism and concluded with the DeWitt-Wheeler equation and
DeWitt’s derivation of Feynman rules for perturbative GR. In the middle ages
(1970-1983), the discovery of black hole thermodynamics and Hawking’s derivation
of black hole radiation radically affected our understanding of general relativity. In
the renaissance period (1984-1994), there were two influential developments. From
the covariant approach, attempts to get rid of infinities merged into string theory. The
use of strings and branes extends the theoretical framework of quantum field theory.
From the canonical approach, background-independent loop quantum gravity
emerged 20 years after the DeWitt-Wheeler equation. In Chapter 21, Richard Woodard
starts with experiences of two personal academic careers through the classical and
middle ages, advocates that the cosmological data from the epoch of primordial
inflation are catalyzing the maturation of quantum gravity from speculation into a
hard science, explains why quantum gravitational effects from primordial inflation
are observable, reviews what has been done in perturbative quantum gravity, tells
us what the future holds both theoretically and observationally, and discusses what
this tells us about quantum gravity. In Chapter 22, Steven Carlip reviews the discovery
of black hole thermodynamics and summarizes the many independent ways of
obtaining the thermodynamic and statistical mechanical properties of black holes.
This has offered us some early hints about the nature of quantum gravity. He
then describes some of the remaining puzzles, including the nature of the quantum
microstates, the problem of universality, and the information loss paradox. In the
last chapter, Dah-Wei Chiou gives us a rather self-contained introductory review
on loop quantum gravity, a background-independent nonperturbative approach
to a consistent quantum theory of gravity, placing emphasis on the fundamental
ideas and their significance. The review presents the canonical formulation of loop
quantum gravity as the central topic and briefly covers spin foam theory, the
relation to black hole thermodynamics and loop quantum cosmology, with current
directions and open issues summarized.

Although we do not yet have a consistent, calculable quantum gravity theory
with a degree of completeness comparable to that of quantum electrodynamics or
quantum chromodynamics, the efforts to find one already led to the consistent
renormalization of gauge theory in the 1960s. The new developments since the 1980s,
together with more understanding and further development of perturbation theory, may give
clues to a consistent theory. During these endeavors, the quest for a well-developed
quantum gravity phenomenology, including the quest to find a correct inflationary
(or non-inflationary) scenario, may play a significant role.

The hope is that we will have one within a generation. This book is written and
assembled for graduate students and scientifically oriented general readers alike. Each
chapter is basically a review article. The five Parts are interconnected. Different
combinations can be designed for special topics for graduate students and advanced
undergraduates. For example, the following combinations are suitable for the topics
named:


(i) Basics (Selected Topics in GR): Part I, Chapters 8, 9, 10, 13, 20; 
(ii) Empirical Foundations (Empirical Foundations of Relativistic Gravity): Chapter 2,
Part II, Chapters 10, 11, 13, 14, 16, 20;
(iii) Gravitational Waves: Chapters 2, 9, Part III, Chapters 15, 18, 19; 
(iv) Cosmology: Chapters 5, 6, 10, 12, Part IV, Chapter 20; 
(v) Quantum Gravity: Chapters 3, 4, 10, 18, 19, Part V. 


There can be various other combinations too. 

We are grateful to all contributors for agreeing to write comprehensive reviews
to make this publication possible. We would also like to thank all the referees for
their valuable comments and suggestions: Martin Bucher, Steven Carlip, Dah-Wei
Chiou, Sperello di Serego Alighieri, Angela Di Virgilio, John Eldridge, Jeremy
Gray, Friedrich Hehl, Jim Hough, Ekaterina Koptelova, Ettore Majorana, James
Nester, Ulrich Schreiber, Alexei Starobinsky, David Tanner, Richard Woodard,
An-Ming Wu, Masahide Yamaguchi. We thank the World Scientific staff, especially
Dr. K. K. Phua and Kah Fee Ng, for their generous support in completing the book.

We dedicate this two-volume GR centennial book to the founders of GR and 
various communities who have contributed to this dramatic century of development 
and applications of GR. 


Wei-Tou Ni 
November, 2015


Note Added in Proof


After the foreword was written, the LIGO Scientific and Virgo Collaborations
announced, in February 2016 and in June 2016, the first direct detections of
gravitational waves (GWs) by the LIGO Hanford and LIGO Livingston detectors in September
2015 and in December 2015. With the LIGO discovery announcements, two important
things are verified: (i) GWs are directly detected in the solar system; (ii) black
holes (BHs), binary BHs and BH coalescences are discovered and measured experimentally
and directly, at distances of more than 1 billion light years.
These discoveries constitute the best celebration of the centennial of the genesis of
general relativity. We refer readers to Refs. 1 and 2 for the discovery and to Refs.
3 and 4 for a brief history of gravitational wave research.

A web page will be set up for updates of the reviews of these two volumes. 
Please see http://astrod-wikispaces.com/ for announcement.


Chapter 1, Fig. 1. System of five concentric spheres representing the motion of a planet (here the
Earth) in Eudoxus’ mathematical representation.


Chapter 1, Fig. A.2. Sketch of the experimental device used by Michelson and Morley in
1887.


Chapter 3, Fig. 7. Not quite seriously: “Schwarzschild” (left) versus “Kerr” (right).


Chapter 3, Fig. 12. Ergosurfaces, horizons, and singularity for slow, extremal (“critical”), and fast
Kerr black holes.


Chapter 3, Fig. 15. Maximal analytic extension of the Kerr spacetime.


Chapter 7, Fig. 5.


Chapter 7, Fig. 10. Photograph of the MeO laser station at Grasse (France), built by the end of
the seventies for lunar laser ranging and redesigned for satellite and time transfer in 2005.


Chapter 7, Fig. 12. Photograph of units A (right) and B (left) of the T2L2 space instrument.
The cylinders on the right are the detection modules (linear and nonlinear). The LRA module is
not shown in the photo.


Chapter 7, Fig. 13. CAD view of the whole Jason-2 satellite. The T2L2 instrumentation is split into
two units A and B, respectively outside and inside the satellite.


Chapter 7, Fig. 20. Earth to Moon one-way laser ranging with LRO through the LOLA of the
spacecraft and some laser stations on ground.


Chapter 7, Fig. 21. RF antenna. The Laser ranging receiver telescope is on the left from the center 
of the main RF antenna (red ellipse). LOLA is coupled with the telescope through an optical fiber. 
(Courtesy: NASA Goddard Space Flight Center). 


Chapter 7, Fig. 23. LTT equipment. On the left is the detector; in the middle lies the main
electronic package including the event timer. (Courtesy: Shanghai Astronomical Observatory).


Chapter 9, Fig. 8. Plot of companion mass (m2) versus pulsar mass (m1) for PSR J0737-3039
with observed constraints interpreted in the framework of GR. The inset shows the central region
at an expanded scale, illustrating that GR is consistent with all constraints (M. Kramer, private
communication).


Chapter 9, Fig. 20. Limits on the relative energy density of the GWB, Ω_GW, at a GW frequency of
2.8 nHz based on the PPTA data sets, together with predictions for Ω_GW based on several different
models for the GWB. The solid and dashed lines that are asymptotic to 1.0 at low Ω_GW show
the probability Pr that a GWB signal of energy density Ω_GW can exist in the PPTA data sets,
based on Gaussian and non-Gaussian GWB statistics respectively. The shaded region is ruled
out with 95% confidence by the PPTA data. Corresponding limits from analysis of EPTA
and NANOGrav data sets, scaled to f_GW = 2.8 nHz, are also shown. The Gaussian curves
show the probability density functions for the existence of a GWB with energy density Ω_GW
based on a merger-driven model for growth of SMBHs in galaxies, an empirical synthesis of
observational constraints on SMBHs in galaxies, and the Millennium dark matter
simulations together with semi-analytic models for growth of SMBHs in galaxies (see Ref. 119
for more details).


Chapter 10, Fig. 3. Strain PSD amplitude versus frequency for various GW detectors and GW sources.
See Fig. 2 caption for the meaning of various acronyms.


Chapter 11, Fig. 8. Explorer resonant antenna in Rome. Source: Photo is reprinted from
http://www.roma1.infn.it/rog/explorer/.


Chapter 11, Fig. 11. AURIGA cryogenic resonant antenna is the twin of NAUTILUS, which is
placed at Legnaro in Padova. Source: Photo is reprinted from http://www.auriga.lnl.infn.it.


Chapter 11, Fig. 21. The LIGO project started in 1994 to construct a pair of 4 km baseline
facilities for laser interferometers, separated by 3030 km, in Livingston, Louisiana and
in Hanford, Washington. Source: These pictures are taken from Ref. 72.


Chapter 11, Fig. 22. During the initial LIGO project, it took about five years to attain the target
design sensitivity after the installation. However, a much shorter period is expected to achieve
the sensitivity of the advanced LIGO, the installation of which was finished in 2015.


[Figure: sensitivity curves versus frequency [Hz] for the Virgo runs C1 (Nov 2003), C2 (Feb 2004),
C3 (Apr 2004), C4 (Jun 2004), C5 (Dec 2004), C6 (Aug 2005), C7 (Sep 2005), WSR1 (Sep 2006),
WSR10 (Mar 2007), VSR1 (May 2007) and VSR2 (October 2009), together with the Virgo design
curve. C1 and C2: single arm; C3 and C4: recombined; C5 and after: recycled.]


Chapter 11, Fig. 26. Sensitivity improvement of Virgo. The sensitivity is inferior to that of LIGO 
at around mid frequencies. However, it is much better than LIGO at lower frequencies. Source: 
The figure is taken from a paper after VSR2 (see Ref. 153).


Chapter 11, Fig. 35. An end cryostat of CLIO placed underground at Kamioka mine. Thermal
noise at cryogenic temperature, 10 K, was achieved in 2009.


Chapter 11, Fig. 37. KAGRA is a 3km baseline length power-recycled Fabry—Perot Michelson 
interferometer having RSE configuration with cryogenic mirrors, and is placed underground at 
Kamioka in Gifu prefecture.


Chapter 12, Fig. 1. Schematic of LISA-type orbit configuration in Earthlike solar orbit.


Chapter 12, Fig. 4. Strain PSD amplitude versus frequency for various GW detectors and GW
sources. The black lines show the inspiral, coalescence and oscillation phases of GW emission from
various equal-mass black-hole binary mergers in circular orbits at various redshifts: solid line, z = 1;
dashed line, z = 5; long-dashed line, z = 20. See text for more explanation. [Cassini Spacecraft
Doppler Tracking (CSDT); Supermassive Black Hole-GW Background (SMBH-GWB).]


Chapter 12, Fig. 2. Schematic of ASTROD-GW orbit configuration with inclination. Left, projection
on the ecliptic plane; Right, 3D view with the scale of vertical axis multiplied tenfold.


Chapter 12, Fig. 5. Characteristic strain h_c versus frequency for various GW detectors and sources.
The black lines show the inspiral, coalescence and oscillation phases of GW emission from various
equal-mass black-hole binary mergers in circular orbits at various redshifts: solid line, z = 1; dashed
line, z = 5; long-dashed line, z = 20. See text for more explanation. [Cassini Spacecraft Doppler
Tracking (CSDT); Supermassive Black Hole-GW Background (SMBH-GWB).]


Chapter 12, Fig. 3. Schematic (left) and artist’s conception (right) of the OMEGA mission.


Chapter 12, page I-587. Normalized GW spectral energy density Ω_GW versus frequency for various
GW detectors and GW sources. The black lines show the inspiral, coalescence and oscillation
phases of GW emission from various equal-mass black-hole binary mergers in circular orbits at
various redshifts: solid line, z = 1; dashed line, z = 5; long-dashed line, z = 20. See text for
more explanation. [Cassini Spacecraft Doppler Tracking (CSDT); Supermassive Black Hole-GW
Background (SMBH-GWB).]


The genesis of special relativity is intimately related to the development of the theory
of light propagation. When optical phenomena were described, there were typically two
kinds of theories: (i) one based on light rays and light particles and (ii) one considering
light as waves. When diffraction and refraction were experimentally discovered,
light propagation became more often described in terms of waves. Nevertheless, when
attempts were made to explain how light was propagated, it was nearly always in terms
of a corpuscular theory combined with an ether, a subtle medium supporting the waves.
Consequently, most of the theories from Newton’s to those developed in the 19th century
were dual and required the existence of an ether. We therefore used the ether as our
Ariadne’s thread for explaining how the principle of relativity became generalized to the
so-called Maxwell equations around 1900. Our aim is to describe how the
successive ideas were developed and interconnected rather than to frame the context in
which these ideas arose.


Keywords: Special relativity; ether; light propagation; electrodynamics; Maxwell equations.


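“Comprendre la genèse de la science [...] est indispensable
pour l’intelligence complète de la science elle-même”
(“Understanding the genesis of science [...] is indispensable
for a complete understanding of science itself”)


Henri Poincaré, La Science et l’Hypothèse, p. 163, 1902.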


1. Introduction 


One of the very first contributions to the principle of relativity is due to Nicola
Cusano (1401-1464), who addressed the question: “How would a person know that
a ship was in movement, if, from the ship in the middle of the river, the banks were
invisible to him and he was ignorant of the fact that water flows?”¹ He was followed
by Giordano Bruno (1548-1600), an Italian Dominican friar who was condemned
for his theological positions, which were, for instance, that “Christ was not God but
merely an unusually skillful magician, that the Holy Ghost is the soul of the world,
that the Devil will be saved, etc.”² In one of his books,³ he considered the problem
of someone dropping a stone from the top of the mast of a ship moving at a constant
velocity. Bruno thus stated that a difference exists between the motion of a ship
and the motion of what it contains, otherwise, “one could never draw something
along a straight line from one of its corners to the other, and that it would not be
possible for one to make a jump and return with his feet to the point from where he
took off”. Consequently [Ref. 3, p. 85],


if someone was placed high on the mast of that ship, move as it may however 
fast, he would not miss his target at all, so that the stone or some other 
heavy thing thrown downward would not come along a straight line from 
the point E which is at the top of the mast, or cage, to the point D which is 
at the bottom of the mast, or at some point in the bowels and body of the 
ship. Thus, if from the point D to the point E someone who is inside the 
ship would throw a stone straight [up], it would return to the bottom along 
the same line however far the ship moved, provided it was not subject to 
any pitch and roll. 


Bruno was arrested by the Inquisition in Venice where he was up to May 22,
1592. Galileo Galilei (1564-1642) arrived on March 23, 1592 at Padua University
(the university of the Republic of Venice). What Galileo actually learned from
Bruno is not documented, nor whether they met each other, most likely because of
Bruno’s condemnation, but similar ideas were then developed by Galileo through
the numerous subjects he investigated during his career, and these can be considered
as premises to relativity as it was defined in 1905.

While mostly associated with mechanics, the developments of so-called special
relativity also took place in various scientific domains such as optics and
electrodynamics, with connections that are not always straightforward: it is therefore
somewhat delicate to establish the genesis of this theory. However, when the evolution of
ideas in electrodynamics and optics is retraced up to the structural foundations
of special relativity, one fundamental concept recurs and cannot be avoided: the ether.
Always associated with propagative phenomena of any origin, such as light propagation
in optics as well as the flow of electrical charges in electrodynamics, the ether (the
name used for designating the subtle medium in which propagation of any kind is
described) is used in all theories developed during the 200 years up to 1905, with
no exception.

Consequently, the genesis of special relativity cannot be provided without discussing
the concept of ether; we choose to develop it from its origins. What is it?
Where does it come from? In what and how does it intervene in wave propagation?
All these questions were so often discussed during the early developments of special
relativity that it would not be possible to bring out a certain consistency in the
succession of ideas and discoveries without it. We thus took the ether
as an Ariadne’s thread. By investigating its fundamental role in optics and
electrodynamics, we were able to track how these theories fused, with special
relativity as the outcome.

The subsequent parts of this chapter are organized as follows. Section 2 is
devoted to the origin of the ether and its first relationships with light. Section 3
discusses Galileo’s composition law for velocities. We address the nature of light
in Sec. 4 and discuss how optical phenomena were linked with electromagnetic
fields in Sec. 5. Section 6 is devoted to the invariance of the equations describing
light propagation under a coordinate transformation between two frames in
uniform translation with respect to each other. Section 7 discusses Poincaré’s contribution
(most of this section was provided in Ref. 4) and Sec. 8 briefly describes Einstein’s
1905 contribution. Section 9 gives a short conclusion.


2. The Ether: From Celestial Body Motion to Light Propagation 
2.1. Its origin 


The concept of an ether, an elastic medium with subtle properties, was first
introduced by Aristotle (−384 to −322)⁵ for explaining the motion of planets and
to propose a mechanical model based on the mathematical theory describing the
motion of celestial bodies proposed by Eudoxus of Cnidus (−405 to −355). According
to the latter, a simple mathematical solution should exist to describe the complex
motions of planets, including their apparent irregularities, by combining simple
circular motions (Ref. 6, p. 106 and Ref. 7, pp. 111-122).

These complex motions result from the rotation of concentric spheres, each with 
a different velocity, the Earth being motionless in the center of that system. In order 
to reproduce the complexity of these motions, Eudoxus distinguished the uniform 
circular motion of the sphere holding the stars, which only requires a single circular 
motion, from the motions of the Sun, the Moon and the five planets which mostly 
take place in the ecliptic plane: The latter rotations are in the opposite direction 
(compared to the rotation of the stars). 

In this model, the solar and lunar motions are explained by combining the circu- 
lar motion of three concentric spheres, and those corresponding to the five planets 
(Venus, Mercury, Mars, Jupiter and Saturn) by combining the circular motion of five 
concentric spheres. For each system of concentric spheres, the motion of the outermost
sphere was induced by the movement of the fixed stars, the inner spheres being
animated by counter-rotating motions. The innermost sphere of each system
supported the corresponding celestial body (Fig. 1).

For the planets, the three concentric spheres between the outermost and the
innermost spheres were introduced for reproducing visible movements (sphere 2)
as well as stations and retrogradations of planets (spheres 3 and 4) (Fig. 1). Since
solar and lunar motions do not present retrograde phenomena, spheres 3 and 4 are
unnecessary and, consequently, their motions were described by only three concentric
spheres.

In order to take into account the seasonal variability, Callippus of Cyzicus
(-370 to -300) added two circular motions for describing the solar and lunar
motions, and one for reproducing those of Mars, Venus and Mercury (Ref. 6, p. 111).


Fig. 1. System of five concentric spheres representing the motion of a planet (here the Earth) in 
Eudoxus’ mathematical representation. (For color version, see page I-CP1.) 


The resulting mechanical model remains in agreement with Aristotle’s mechanics
for explaining how the rotation of the spheres propagates from the stars down to the
sublunar realm, at the center of which is the Earth. In Aristotle’s world, there
are two subworlds: the supralunar realm, where the spheres associated
with stars, planets, the Sun and the Moon are located, and the sublunar realm, bounded by
the sphere holding the Moon. The origin of motion is thus divine by nature since
heaven is commonly located at “the extremity or upper region, which we take to
be the seat of all that is divine” (Ref. 5, Book 1, Sec. 9). It is then propagated by
friction from one sphere to the other. For Aristotle, motion is necessarily associated
with body (Ref. 5, Book 1, Sec. 15) and, more generally, with matter (Ref. 5, Book 1,
Sec. 2):


All natural bodies and magnitudes we hold to be, as such, capable of loco- 
motion; for nature, we say, is their principle of movement. But all movement 
that is in place, all locomotion, as we term it, is either straight or circular 
or a combination of these two, which are the only simple movements. 


In addition, the four elements (earth, air, water and fire), from which everything
belonging to the sublunar realm, including the Earth, is made, are animated
by straight motion, either up or down with respect to the center of the Earth; since
these motions cannot be sustained forever, the sublunar realm is subject to change:
It is “corruptible”. In contrast, the supralunar realm, close to the divine
entity, must be ungenerated, not corruptible and immutable: Celestial bodies are
thus necessarily animated by circular motion, the simplest periodic motion, that
is, one repeated identically forever. Such a perfect motion is, by essence, natural
(nonforced) (Ref. 5, Book 1, Sec. 2):


By simple bodies I mean those which possess a principle of movement in 
their own nature [...]. Supposing, then, that there is such a thing as simple
movement, and that circular movement is an instance of it, and that both 
movement of a simple body is simple and simple movement is of a simple 
body [...] then there must necessarily be some simple body which revolves 
naturally and in virtue of its own nature with a circular movement. 


By definition, a body which naturally has a circular motion cannot be one of the 
four sublunar elements, nor made of them. Consequently, another substance must 
exist, a fifth element, which is naturally and eternally animated by a circular motion.
Moreover, the (Ref. 5, Book 1, Sec. 2) 


circular motion is necessarily primary. For the perfect is naturally prior to 
the imperfect, and the circle is a perfect thing. This cannot be said of any 
straight line [...]. These premises clearly give the conclusion that there is 
in nature some bodily substance other than the formations we know, prior 
to them all and more divine than them. 


According to this principle, which ties any motion to matter, vacuum cannot exist:
Otherwise, for Aristotle, who believed that velocity was inversely proportional to
the density of the medium, the velocity could be infinitely large, which was not
conceivable. The fifth (divine) substance, associated with the circular motion of
celestial bodies, can only be like the supralunar realm itself, that is, ungenerated,
not corruptible, “exempt from increase and alteration” (Ref. 5, Book 1, Sec. 3):


For all men [who] have some conception of the nature of the gods, and [...] 
who believe in the existence of gods at all, whether Barbarian or Greek, 
agree in allotting the highest place to the deity, surely because they suppose 
that immortal is linked with immortal and regard any other supposition as 
inconceivable. [...] so, implying that the primary body is something else 
different from earth, fire, air and water, the Older gave the highest place a 
name of its own, aether, derived from the fact that it “runs always” for an 
eternity of time. 


The ether, a simple natural body, was thus associated with the circular motion of
celestial bodies: It constituted the supralunar realm up to the heavens. It has
a divine character and very particular properties (Ref. 5, Book 1, Part 3):


The body [...] which moves in a circle cannot possibly possess either heav- 
iness or lightness. For neither naturally nor unnaturally can it move either 
towards or away from the center.


Aristotle wants not only to explain what produces the original motion, but also
how it propagates through the universe. He thus explains that the motion of the
primary sphere is responsible for all the others (Ref. 5, Book 2, Part 12):


In thinking of the life and moving principle of the several heavens one must 
regard the first as far superior to the others. Such a superiority would be 
reasonable. For this single first motion has to move many of the divine 
bodies, while the numerous other motions move only one each, since each 
single planet moves with a variety of motions. 


Aristotle does not leave the center of circular motion without any associated body 
(Ref. 5, Book 2, Part 3):


Earth then has to exist; for it is Earth which is at rest at the center [...]. 
Earth is required because eternal movement in one body necessitates eter- 
nal rest in another. 


This geocentric system, proposed by Aristotle and based on mathematical laws
proposed by Eudoxus of Cnidus, remained well accepted for a long time, with very
few exceptions, for instance Aristarchus of Samos (-310 to -230).


2.2. The luminiferous ether 


For Aristotle, the ether was thus a medium propagating the circular motions that 
animate all celestial bodies. It became the medium for light propagation with the 
Aristotelian Bishop of Lincoln, Robert Grosseteste (1175-1253). For him, light is the 
“first corporeal form” (Ref. 8, p. 10), the “bodily spirit” (Ref. 8, p. 13), whose lack 
of determined properties allows it to be transformed into any substance, whatever 
its nature, and, consequently, into the four sublunar elements. Being able to be
diffused at infinity, light is “inseparable from matter” (Ref. 8, p. 13) which can thus 
propagate as sound or heat: Sound and heat therefore contain light. 
Consequently, the universe was created from light which (Ref. 8, p. 13) 


by extending first matter into the form of a sphere and by rarefying its out- 
ermost parts to the highest degree, actualized completely in the outermost 
sphere the potentiality of matter, and left this matter without any potency 
to further impression. And thus the first body in the outermost part of the 
sphere, the body which is called the firmament, is perfect, because it has 
nothing in its composition but first matter and first form [...]. When the 
first body, which is firmament, has in this way been completely actualized, 
it diffuses its light (lumen) from every part of itself to the center of the 
universe. 


Thus light, which is considered as “the perfection of the first body” (Ref. 8, p. 13),
expanded and brought together from the first body [firmament] toward the center
of the universe, gathered together the mass existing below the first body. Moreover
(Ref. 8, p. 13)


since this light (lux) is a form entirely inseparable from matter in its diffusion
sion from the first body, it extends along with itself the spirituality of the 
matter of the first body. 


As a consequence, the firmament is the place where light is the simplest (least dense),
becoming denser and denser while approaching the center of the universe from the
primary sphere to the “ninth and lowest sphere, the dense mass which constitutes
the matter of the four elements” (Ref. 8, p. 14). Since vacuum does not exist, the
firmament, the boundary of the world, must be finite. Matter can thus be expanded
from earth to fire through water and air, light being what remains beyond the realm
of fire. In Grosseteste’s world, the nine heavenly spheres are not subject to change,
since they are already in the form of light. In contrast, the sublunar realm can change,
that is, an element can be transformed into another one by expansion (rarefaction)
or by compaction (condensation). According to Grosseteste, matter can be viewed
as a high compaction of light, matter being made from light. There is a kind of
equivalence between matter (mass) and light. Higher (lighter) bodies are more spir- 
itual, and lower (heavier) bodies are more corporeal. Since the supralunar realm is 
eternal, there is no rarefaction nor condensation (of light) and, consequently, the 
sole possible motion is circular motion. It is interesting to note that the constraint 
on light motion determines the motion of the heavenly spheres. 

Grosseteste thus proposed a mechanical explanation for the motion of the 13 
spheres of the world (Ref. 8, p. 16): 


The lower body receives its motion from the same incorporeal moving
power by which the higher body is moved. For this reason the incorporeal 
power of intelligence or soul, which moves the first and highest sphere with 
a diurnal motion, moves all the lower heavenly spheres with this same 
diurnal motion. But in proportion as these spheres are lower they receive 
this motion in a more weakened state, because in proportion as a sphere is 
lower the purity and strength of the first corporeal light is lessened in it. 


For Grosseteste, the light of the first body is “unchangeable” (Ref. 8,
p. 14) and “rarefied to the highest degree” (Ref. 8, p. 13): It is of divine spirit and
responsible for the motions of the planets. It is thus identical to Aristotle’s
ether, leading to a unique entity named the luminiferous ether. According
to Grosseteste’s conceptions, light is present inside matter. The fact that
light was “filling” matter was later explained by René Descartes (1596-1650)
(Ref. 9, p. 5):


I first suppose that water, earth, air and fire and any other body of our 
environment are made of many small parts of different shapes and sizes, 
which are never so well arranged nor so accurately joined together that
there do not always remain some intervals around them, and that these intervals are
not empty but filled with this so subtle matter, by the means of which is
propagated the action of light. Moreover, one must think that the sub- 
tle matter filling the intervals between parts of these bodies is of such a 
nature that it never ceases to move very rapidly but not exactly with the 
same speed in any place nor in any time, but it commonly moves slightly 
faster toward the Earth surface than it does towards the heavens and faster 
towards places close to the equator than towards poles, and at the same 
place, faster during summer than during winter, and during day than night. 


For Descartes, light is still made of bodies but is now responsible for the motion of 
the ether (Ref. 9, p. 6): 


Light is nothing else than a certain motion or an action by which luminous
bodies push this subtle matter in a straight line in any direction around
them.


Descartes’s ether is required for propagating light as well as any other propagative 
phenomenon like, for instance, heat (Ref. 9, p. 7): 


Marbles and metals seem colder than woods, [and] one must think that 
their pores do not receive so easily subtle parts of this matter [ether]. 


Vibrations were mentioned by Nicolas Malebranche (1638-1715) who proposed
to describe interactions between light and ether by considering ether as a fluid 
governed by vortices (as in Descartes’s work) (Ref. 10, p. 377): 


Since reflection and refraction of rays are not produced by the action of air 
nor glass during transition from one to the other, it is thus necessary that 
the cause comes from the action of the subtle matter, since there is here 
only air, glass and subtle matter. In order to explain the manner in which it 
happens, it must be remarked that any part of ether, or all small whirls from 
which I believe to have shown that it is made of, are also pressed on and at 
equilibrium with each other or always tend to be so [...]. Let us assume
that all these small whirls of ether are equally and as infinitely pressed 
on, and that they counterbalance each other by their centrifugal forces, as 
soon as the small parts of a luminous body squeeze the small whirls that 
are encountered, their pressure is communicated to all the others up to us, 
and doing so in an instant, because there is no vacuum. These small parts of 
a luminous body, by their various motions squeezing by shaking the whirls 
which are resisting, induce in them vibrations of pressure. And all these 
vibrations of pressure are made in a straight line, until they are in ether 
[...]. These rays cannot change in direction, but when they meet obliquely 
a glass surface, they are subject to a refraction and deviate towards the 
perpendicular line to this surface; this refraction is as large as the bodies
in which they enter are more weighted and denser than those from which 
they are issued. 


Light propagation thus results from the interactions between luminous bodies constituting
the light and the ether considered as a fluid whose whirls are at the origin
of vibrating properties. Ether and light particles are both in motion, but only light is
propagated over long distances.


3. Galileo’s Composition Law for Velocities 


The first to support anew the idea of a heliocentric system, following for instance
Aristarchus of Samos, was Mikolaj Kopernik (1473-1543). He did not want to
introduce a rupture in Aristotelian conceptions but rather to better match
Aristotle’s first principle, according to which celestial bodies are subject to circular
motions. In fact, Kopernik was looking for a system where fewer combinations
of circular motions would be required (Ref. 11): In particular, he wanted
to remove from his heliocentric system the equant, which he considered to be against
Aristotle’s principles. In a certain sense, Kopernik was more a conservative than a
revolutionary! In contrast, Galileo, when he compared the “two chief
world systems” (Ref. 12), was more motivated by exposing the weaknesses of Aristotle’s
rationale than by proving the correctness of the heliocentric system. In doing this,
Galileo started to discuss some problems related to what is now called the principle
of relativity.

Thus, in the second day of his discourses, Galileo remarked that one is unable
to detect a motion in which one takes part (Ref. 12):


Whatever motion comes to be attributed to the Earth must necessarily 
remain imperceptible to us and as if nonexistent, so long as we look only 
at terrestrial objects; for as inhabitants of the Earth, we consequently participate
in the same motion [...].

Motion, in so far as it is and acts as motion, to that extent exists 
relatively to things that lack it; and among things which all share equally 
in any motion, it does not act, and is as if it did not exist. 

It is obvious, then, that motion which is common to many moving 
things is idle and inconsequential to the relation of these movables among 
themselves, nothing being changed among them, and that it is operative 
only in the relation that they have with other bodies lacking that motion, 
among which their location is changed. 


Galileo thus proposed a procedure to reveal the motion of the Earth:


The true method of investigating whether any motion can be attributed to 
the Earth, and if so what it may be, is to observe and consider whether 
bodies separated from the Earth exhibit some appearance of motion which 
belongs equally to all. For a motion which is perceived only, for example,
in the Moon, and which does not affect Venus or Jupiter or the other stars,
cannot in any way be the Earth’s or anything but the Moon’s. 

Now there is one motion which is most general and supreme over all, 
and it is that by which the Sun, Moon and all other planets and fixed 
stars — in a word, the whole universe, the Earth alone excepted — appear 
to be moved as a unit from east to west in the space of twenty-four hours. 
This, in so far as first appearances are concerned, may just as logically 
belong to the Earth alone as to the rest of the universe, since the same 
appearances would prevail as much in the one situation as in the other. 


He then clearly addressed the problem of describing a motion in a frame at rest and
compared it to its description in a frame animated by a uniform translation (Ref. 12):


If a stone is dropped from the top of a mast with a large velocity falling 
down exactly at the same place of the ship as if it would have been at 
rest, how can this fall serve you to decide whether the ship is at rest or in 
motion? 

[...] the same argument being valid for the ship as for the Earth, one 
cannot be conclusive about the motion or the rest of the Earth. 

[...] with respect to the Earth, the tower and to us, that all are moving 
with the daily motion, simultaneously to the stone, the daily motion is as 
it was nothing, it remains indifferent, not perceptible, and has no action; 
only observable to us is the motion that we are lacking, the motion of the 
stone which skims along the tower while falling. 


Thus, a motion can be perceived only with respect to a reference frame
that is not animated by this motion. Many other examples of relative motions
were developed by Galileo to support his explanations. As an ultimate experiment,
he proposed (Ref. 12):


Be with a friend in the largest cabin under the deck of a large ship and
take with you some flies, butterflies and other small animals that are flying,
take also a large vessel filled with water with small fishes, hang up also a
small bucket from which water is flowing drop by drop into another
vase with a small aperture in its bottom. When the ship is at rest, observe
carefully how these small flying animals move with the same velocity in any
direction of the cabin, how fishes equally swim in any direction [...]; if you
bunny hop, as we say, you will move by equal distances in any direction.
When you have carefully observed this, although there is no
doubt that it must occur so while the ship is at rest, let the ship move at the
velocity you want: as long as the motion is uniform without pitch or roll, you
will not remark the least change in all the effects that we just described.
None will allow you to detect whether the ship is moving or is at rest [...].
By jumping, you will move on the floor by the same distances as before, and
you will not jump further towards the bow or the stern because the ship
moves very fast; however, during the time you are in the air, the floor
below you runs in the direction opposite to your jump [...], butterflies and
flies will continue to fly equally in all directions, you will never see them
taking refuge towards the wall in the stern as if they were tired of following the
fast move of the ship [...]. If these effects match each other, it results from
the fact that the ship motion is shared with everything it contains as well
as with the air; this is why I asked you to be under the deck.

If you had been on the deck, the air would not have followed the ship’s run and
we would have observed more or less noticeable differences.


Galileo thus established that it is impossible to assert whether a body with which we
share a common uniform motion is moving or not. He also argued against Aristotle’s
case for the nonexistence of vacuum. This can be seen as a first attempt to remove the
ether, but it was not made in the context of propagative phenomena: Galileo
therefore had no need to consider an ether! While investigating bodies falling in
a medium, he understood that only the density of the medium is responsible for
the differences in their velocities. Extrapolating this result to a medium with zero
density, he came to the conclusion that “if we had fully removed the medium resistance, all
bodies would fall at the same velocity”. Vacuum thus became for Galileo an ideal
frictionless medium for investigating motion.

His repeated studies on mechanical laws governing falling bodies led him, in the
early 1630s, to investigate the motion of a ball rolling on and falling from a table.
He thus understood that the parabola described by the ball once it had left the
table was the result of a combination of two motions as follows: “A projectile which
is carried by a uniform horizontal motion compounded with a naturally accelerated
vertical motion describes a path which is a semiparabola” (Ref. 13).
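
In modern notation, which is not Galileo’s, the combination of the two motions quoted
above can be sketched as follows. The symbols are assumptions introduced for
illustration only: $v_0$ is the constant horizontal speed of the ball when it leaves the
table, $g$ the constant acceleration of free fall, $t$ the time elapsed since the ball left
the table, and $x$ and $y$ the horizontal and vertical coordinates measured from the
edge of the table ($y$ pointing upward), air resistance being neglected.

```latex
\begin{align}
  x(t) &= v_0\, t, \\
  y(t) &= -\tfrac{1}{2}\, g\, t^{2}, \\
  y    &= -\frac{g}{2 v_0^{2}}\, x^{2} \qquad (x \ge 0).
\end{align}
```

Eliminating $t$ between the first two lines gives the third one, which is half of a
parabola whose vertex is at the edge of the table, that is, the semiparabola of the
quotation.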

From these observations, the composition law for velocities was later formulated
and became known as the principle of Galilean relativity. Nevertheless, although it is
commonplace in mechanics, the Galilean composition law was quickly questioned when
applied to the propagation of light. Light propagation was considered as infinitely
fast, if not instantaneous. Galileo seems to be one of the first to imagine an experiment
for measuring the light velocity (Ref. 13):


Let each of two persons take a light contained in a lantern, or other recepta- 
cle, such that by the interposition of the hand, the one can shut off or admit 
the light to the vision of the other. Next, let them stand opposite each other 
at a distance of a few cubits and practice until they acquire such skill in 
uncovering and occulting their lights that the instant one sees the light of 
his companion he will uncover his own. After a few trials the response will 
be so prompt that without sensible error [svario] the uncovering of one light 
is immediately followed by the uncovering of the other, so that as soon as 
one exposes his light he will instantly see that of the other. Having acquired 
skill at this short distance let the two experimenters, equipped as before, 
take up positions separated by a distance of two or three miles and let
them perform the same experiment at night, noting carefully whether the
exposures and occultations occur in the same manner as at short distances; 
if they do, we may safely conclude that the propagation of light is instanta- 
neous; but if time is required at a distance of three miles which, considering 
the going of one light and the coming of the other, really amounts to six, 
then the delay ought to be easily observable. 


Galileo thus concluded that if the light propagation is not instantaneous, “it is at
least extremely fast, nearly immediate” (Ref. 13). But Galileo did not mention any value
for the light velocity.
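
To give an order of magnitude for the delay that the two experimenters would have had
to detect, the following minimal sketch computes the round-trip time of light over the
six miles mentioned in the quotation. It relies on assumptions that are not in the
original text: the modern value of the speed of light, the modern statute mile (the mile
used by Galileo was different) and a human reaction time of about 0.2 s, taken here
only as a rough illustrative figure.

```python
# Round-trip light delay in the lantern experiment (illustrative values only).
MILE_M = 1609.344            # modern statute mile in metres (assumption)
C_M_PER_S = 299_792_458.0    # modern value of the speed of light in m/s
REACTION_TIME_S = 0.2        # assumed order of magnitude of a human reaction time


def round_trip_delay(separation_miles: float) -> float:
    """Return the time in seconds light needs to go and come back."""
    return 2.0 * separation_miles * MILE_M / C_M_PER_S


delay = round_trip_delay(3.0)
print(f"round-trip delay over 3 miles: {delay * 1e6:.1f} microseconds")
print(f"ratio to the assumed reaction time: {delay / REACTION_TIME_S:.1e}")
```

Under these assumptions, the delay is several orders of magnitude shorter than any
human reaction time, which is consistent with the absence of a measured value.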

The first evaluation of light velocity was provided by Ole Roemer (1644-1710),
who used the eclipses of Jupiter’s first satellite (Ref. 14). In order to assess the light
velocity, Roemer supposed that, according to the additive law for velocities, when the Earth
moves toward Jupiter during its revolution around the Sun, the light should take less
time for traveling from Jupiter to the Earth than when it moves in the opposite
direction (six months later) (Fig. 2). The results of Roemer’s observations were
equivalent to a light velocity equal to 2 × 10⁸ m·s⁻¹.
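
The relation between a light velocity and the delay accumulated when the Earth moves
from the near side to the far side of its orbit can be sketched as follows. This is only
an illustration under stated assumptions: the Earth’s orbit is idealised as a circle, the
modern value of the astronomical unit is used (it was not available to Roemer), and the
function names are hypothetical.

```python
# Sketch: light velocity versus delay across the diameter of the Earth's orbit.
AU_M = 1.495978707e11        # modern astronomical unit in metres (assumption)
C_MODERN = 299_792_458.0     # modern speed of light in m/s
V_TEXT = 2.0e8               # light velocity quoted in the text, in m/s


def crossing_time_min(speed_m_per_s: float) -> float:
    """Time in minutes for light to cross the diameter of the orbit."""
    return 2.0 * AU_M / speed_m_per_s / 60.0


def speed_from_delay(delay_min: float) -> float:
    """Light velocity in m/s implied by a delay (in minutes) across the diameter."""
    return 2.0 * AU_M / (delay_min * 60.0)


delay_text = crossing_time_min(V_TEXT)
delay_modern = crossing_time_min(C_MODERN)
print(f"delay for 2e8 m/s: {delay_text:.1f} min")
print(f"delay for the modern value: {delay_modern:.1f} min")
print(f"velocity recovered from the first delay: {speed_from_delay(delay_text):.2e} m/s")
```

The two functions are inverse to each other: a measured delay and an estimate of the
size of the orbit are enough to obtain a velocity, which is the principle of the method.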

Fig. 2. Light velocity measurement from Jupiter’s first satellite, with A the Sun, B Jupiter
and C the first satellite, which enters the shadow of Jupiter and leaves it at D, and EFGHKL the
Earth placed at different distances from Jupiter (Ref. 14, p. 234).


Later, James Bradley (1693-1762) showed that the phenomenon investigated by
Roemer, which he named aberration, was an annual motion shown by all stars,
resulting from the combination of the Earth’s velocity around the Sun
(which is also the velocity of the observer) with the light velocity. Since a single
apparent motion was shown by all stars, the light velocity should be the same for
every star and should be uniform over any distance between the celestial body from
which it is issued and the Earth (Ref. 15). The aberration was thus correctly explained with
a single light velocity, the same for light issued from every star, and with the
classical composition law for velocities. Such a conclusion was not widely accepted,
in particular by those who considered light as made of corpuscles, such as
John Michell or François Arago, as we will see later; for them, light velocity should
depend on stellar size, which was directly related to the distance of the star from the Earth.
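
The classical composition of the observer’s velocity with the light velocity can be
illustrated by a short computation of the aberration angle. This is a minimal sketch,
assuming starlight arriving perpendicularly to the Earth’s motion, so that the apparent
direction is tilted by an angle whose tangent is the ratio of the two velocities; the
numerical values are modern ones and the function name is hypothetical.

```python
import math

V_EARTH = 29.78e3            # mean orbital speed of the Earth in m/s (modern value)
C_MODERN = 299_792_458.0     # modern speed of light in m/s


def aberration_arcsec(v_observer: float, v_light: float) -> float:
    """Tilt angle in arcseconds from composing perpendicular velocities."""
    theta_rad = math.atan2(v_observer, v_light)   # tan(theta) = v_observer / v_light
    return math.degrees(theta_rad) * 3600.0


angle = aberration_arcsec(V_EARTH, C_MODERN)
print(f"aberration angle: {angle:.1f} arcseconds")
```

Since the angle depends only on the ratio of the two velocities, observing the same
angle for all stars is what suggests a single light velocity.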


4. Questioning the Nature of Light: Waves or Corpuscles? 


By the end of the 17th century, two main theories concerning the nature of light
were discussed. For both of them, the existence of a luminiferous ether was not questioned
and light (whether wave or particle) must interact with the ether to be
propagated. One of the two main theories was developed by Christiaan Huygens
(1629-1695) (Ref. 16) and was based on Grosseteste’s ideas: Light was mostly considered
as a power propagating through an ether. It led to the wave theory that Thomas
Young (1773-1829) used for explaining diffraction and interference patterns (Refs. 17 and 18).
The second theory was advanced by Isaac Newton (1643-1727), who considered light
as made of small ponderable corpuscles: This is the corpuscular theory, also named the
emission theory. Nevertheless, as we will see, these two theories are both mixed in
the sense that they are not fully based on waves or on corpuscles, but rather combine
the two conceptions.

The corpuscular theory of light was mainly developed in the queries written by
Newton at the end of his book Opticks (Ref. 19), which is mostly devoted to the reflection
and refraction of light rays as well as to a color theory. Newton only addressed
the nature of light when he questioned the production of light as follows (Ref. 19,
Query 8):


Do not all fix’d Bodies, when heated beyond a certain degree, emit Light 
and shine; and is not this Emission perform’d by the vibrating motions of 
their parts? 


Heat and light are here clearly related and associated with vibrations of what 
we would call today atoms or molecules. Indeed, Newton questioned interactions 
between light and matter and not only between light and ether (Ref. 19, Query 5): 


Do not Bodies and Light act mutually upon one another; that is to say, 
Bodies upon Light in emitting, reflecting, refracting and inflecting it, and 
Light upon Bodies for heating them, and putting their parts into a vibrating
motion wherein heat consists?


Nevertheless, Newton suggested that light, when rays are considered, would
be made of small corpuscles (Ref. 19, Query 29):


Are not the Rays of Light very small Bodies emitted from shining Sub- 
stances? For such Bodies will pass through uniform Mediums in right Lines 
without bending into the Shadow, which is the Nature of the Rays of Light.
They will also be capable of several properties, and be able to conserve
their Properties unchanged in passing through several Mediums, which is 
another Condition of the Rays of Light. 


Light rays propagate in straight lines through media but an ether is not specified
here. Some periodic vibrations, which today one would call waves, are also associated
with light propagation, as clearly suggested by the analogy with a stone falling into water
(Ref. 19, Query 17):


If a stone be thrown into stagnating Water, the Waves excited thereby 
continue some time to arise in the place where the Stone fell into the Water, 
and are propagated from thence in concentrick Circles upon the Surface of 
the Water to great distance. And the Vibrations or Tremors excited in 
the Air by percussion, continue a little time to move from the place of 
percussion in concentrick Spheres to great distances. And in like manner, 
when a Ray of Light falls upon the Surface of any pellucid Body, and is there 
refracted or reflected, may not Waves of Vibrations, or Tremors, be thereby 
excited in the refracting or reflecting Medium at the point of Incidence, and 
continue to arise there, and to be propagated from thence as long as they 
continue to arise and be propagated, when they are excited in the bottom 
of the Eye by the Pressure or Motion of the Finger, or by the Light which 
comes from the Coal of Fire in the Experiments abovemention’d? And 
are not these Vibrations propagated from the point of Incidence to great 
distance? And do they not overtake the Rays of Light, and by overtaking 
them successively, do they not put them into the Fits of easy Reflection and 
easy Transmission described above? For if the Rays endeavour to recede 
from the densest part of the Vibration, they may be alternately accelerated 
and retarded by Vibrations overtaking them. 


Newton’s theory of light propagation uses the corpuscular as well as the undulatory
nature of light depending on the medium in which it propagates. Interactions
between light and matter (here seen as a refracting or reflecting medium) are thus
associated with waves that can propagate over great distances. Moreover, vibrations
(waves) are possible for any deflection from the straight line as specified in
Query 29 (Ref. 19):


Nothing more is requisite for putting the Rays of Light into Fits of easy 
Reflection and easy Transmission, than that they be small Bodies which 
by their attractive Powers, or some other Force, stir up Vibrations in what 
they act upon, which Vibrations being swifter than the Rays, overtake them 
successively, and agitate them so as by turns to increase and decrease their 
Velocities, and thereby put them into those Fits. 


Moreover there is a force, not specified, responsible for the change in light
propagation.

For a mechanical explanation of wave propagation, a medium, different from the
air, is required: Such a medium, the ether, can be found in matter (light propagates
through glass) as well as in the heavens (Ref. 19, Query 18):


Is not the Heat of the warm Room convey’d through the Vacuum by the 
Vibrations of a much subtler Medium than Air, which after the Air was 
drawn out remained in the Vacuum? And is not this Medium the same 
with that Medium by which Light is refracted and reflected, and by whose 
Vibrations Light communicates Heat to Bodies, and is put into Fits of easy
Reflexion and easy Transmission? [...] And is not this Medium exceeding
more rare and subtile than the Air, and exceeding more elastik and active? And
doth it not readily pervade all Bodies? And is it not (by its elastik force) 
expanded throughout all the Heavens? 


Ether would be rarer in dense matter (light propagates less easily in those bodies) 
than in the empty celestial space. Newton stated that “light moves from the Sun to
us in about seven or eight minutes of time” (Ref. 19, Query 21) as “this was observed
first by Roemer, and then by others, by means of the Eclipses of the Satellites of
Jupiter” (Ref. 19, Prop. XI), the actual value being about 8.3 min.
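
As a quick check of the 8.3 min figure, the travel time follows from dividing the mean Sun-Earth distance by the speed of light. The sketch below assumes modern values for both constants (neither is given in the text), and the function name is purely illustrative.

```python
# Light travel time from the Sun to the Earth.
# Both constants are modern values assumed for this illustration.
AU_M = 1.495978707e11        # mean Sun-Earth distance in metres
C_M_PER_S = 299_792_458.0    # speed of light in vacuum, metres per second


def sun_light_travel_minutes(distance_m: float = AU_M,
                             speed_m_per_s: float = C_M_PER_S) -> float:
    """Return the light travel time over distance_m, in minutes."""
    return distance_m / speed_m_per_s / 60.0


travel_min = sun_light_travel_minutes()
print(f"Sun-Earth light travel time: {travel_min:.2f} min")
```

The result lies slightly above Newton's "seven or eight minutes", within the accuracy of the eclipse timings available to him.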

Ether was also made of particles which were smaller than those of light, being 
in turn smaller than those of air; but Newton did not know what these ethereal 
particles could be. He thus attributed two properties to the ether: one
resulting from surface forces, implying that the ether is “less able to
resist the motions of Projectiles” (Ref. 19, Query 21), and one due to volume
forces, implying that the ether is “exceedingly more able to press upon
gross Bodies, by endeavouring to expand itself” (Ref. 19, Query 21). In Query 31,
Newton suggested that interactions between “small particles of bodies” could be in
a certain way similar to those between light rays: He even suggested that “a more
attractive power” than gravity, magnetism and electricity could exist. Light might
not be a special phenomenon but one of the phenomena in nature governed by
laws similar to those governing matter. Thus, in his 29th query, he discussed the double refraction
investigated by Huygens?® in Island Crystal* and proposed to attribute “some kind 
of attractive virtue lodged in certain sides both of the rays and of the particles of 
the Crystal” (Ref. 19, Query 29), two sides (acting “as the Poles of two Magnets
answer to one another”, Ref. 19, Query 29) explaining the “usual refraction”,
and two other sides responsible for the “unusual refraction”. According to Newton,
the double refraction observed in Island Crystal could not be explained with a wave 
theory (Ref. 19, Query 28): 


Are not all Hypotheses erroneous, in which Light is supposed to consist in
Pression or Motion, propagated through a fluid Medium? [...] To explain
the unusual Refraction of Island Crystal by Pression or Motion propagated,
has not hitherto been attempted (to my knowledge) except by Huygens,
who for that end supposed two several vibrating Mediums within that
Crystal. But when he tried the Refractions in two successive pieces of that
Crystal, and found them such as is mention’d above; he confessed himself at
a loss for explaining them. [...] To me, at least, this seems inexplicable, if
Light be nothing else than Pression or Motion propagated through Aether.


*We choose to use the term “Island” as used by Newton, Malus or Huygens and not “Iceland” as
sometimes encountered.


According to Newton, light is made of bright corpuscles of very small size, 
possessing properties similar to magnetism, and whose propagation is induced by 
shaking the ether. The vibrations of the ether so produced allow, through the action of
the forces of attraction and repulsion between the bright corpuscles and etherous
molecules, the light corpuscles to be carried from one point to another. Newton’s ether
was thus a medium with the properties that a fluid could have: (i) seen as a
continuous medium allowing wave propagation and (ii) made of particles allowing
him to describe its behavior using forces in a mechanical description according to
his own mechanical laws. Such a mechanical description was the ultimate aim, as
expressed by Jean-Baptiste Biot (1774-1862) while discussing the double refraction
(Ref. 20, p. xv):


You might thus [...] examine the phenomena under all its sides [...]. The 
law for this phenomenon is still only established in an experimental manner; 
it can only be viewed as exact in the limit of accuracy of the experiments. 
In order to make it fully sure and rigorous, it must be brought back to the 
general laws of mechanics, that is, to obtain it from general conditions for 
motion and equilibrium as required by these laws. Because, when such a 
reduction can be fully performed, it necessarily evidences the character of 
forces by which these phenomena are produced, that is the last end where 
science can go. 


Working according to this paradigm, François Arago (1786-1853), who supported a
corpuscular theory of light, had been investigating since 1809 the effect of light velocity
on refraction, one of the most studied optical phenomena. Arago was also aware of a
paper published in 1783 by John Michell (1724-1793) in which light was considered
as being made of small particles. When emitted by a star, these light particles
would have their velocity reduced by the gravitational field as any other material
particle and, consequently, the velocity of the light emitted by stars should depend on their sizes.²¹
In this view, Bradley’s result showing the same aberration for all the stars could only be
due to a lack of accuracy in his observations. Arago wanted to investigate
Bradley’s aberration more deeply by using prisms, which he considered (with Michell) as more
adequate than direct observation due to their sensitiveness “to slight inequalities”.
Arago hoped to be able to determine whether the size of stars (“a circumstance
which must produce significant differences in [light] velocities emanating from these
various bodies”) affects the light velocity or not (Ref. 22, p. 40):


Since the deviation of the light rays penetrating sidewise in diaphanous
bodies is a given function of their initial velocity, we will see that the 
observation of their total deviation while passing through a prism provides 
a natural measure of their velocities. 


In other words, the deflection of light rays in a refringent medium such as a prism should
depend on their velocity in it and, consequently, on their initial velocities. These 
experiments led Arago to conclude that (Ref. 22, p. 40): 


Light moves with the same velocity not depending on the bodies from which 
it is issued or, at least, if there exists some differences, they cannot, in any 
way, alter the exactness of astronomical observations. 


Arago tried to show, “by direct experiments, that there is an increase in the velocity 
required by light rays during a switch from a rare medium into a dense medium” 
(Ref. 22, p. 41). He used for doing this the property that “an inequality in velocity 
produces an inequality in deflection, a fact which directly results from Newton’s
explanation for refraction” (Ref. 22, p. 41). He also considered how refraction was 
affected by the velocity of the body through which the light was passing (Ref. 22, 
p. 42): 


The difficulty presented [...] by the verification of Newton’s theory results
from the principle which follows as a consequence of it: Light velocity, in
any diaphane medium, must be the same, for any kind and number of 
media previously traversed. One can however remark that, when refrin- 
gent bodies are in motion, refraction induced by a body must no longer 
be computed with the [light] absolute value, but with this same velocity 
augmented or reduced by the velocity of the body, that is, with relative 
velocity of the ray; motions that we can give to bodies on the Earth being 
far too small to sensitively influence light refraction, one might search in 
much faster planet motions, some circumstances appropriate for making 
significant these inequalities in refraction. 


Arago thus considered that Earth’s motion combined with his experiment could 
lead to sensitive enough differences. 

In order to obtain a sensitive measurement of the deflection of light rays according
to their velocity, Arago performed an experiment using an achromatic prism
to avoid chromatic aberration, which should consequently allow a better
spectral separation of refracted rays without diffusion among them. By measuring
“distances at zenith of a large number of stars” (Ref. 22, p. 44), thus using light
rays with supposedly different velocities due to the various sizes of the observed stars,
significant deflections through the prism were expected. In spite of this, measured
deflections in these experiments were of the same order of magnitude as in experiments
using direct observations. Arago expected differences of 1/10,000 induced by
the Earth’s orbital motion. Contrary to this, he observed no difference in his observations,
“rays from all stars being affected by the same deflections” (Ref. 22, p. 46). Arago
thus concluded (Ref. 22, p. 47): 


This result seems to be, at first, in a clear contradiction with the Newtonian 
theory of refraction because an actual inequality in the velocity of the rays 
does not imply an inequality in deflections that are induced. 
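
The order of magnitude of 1/10,000 quoted above can be compared with the ratio of the Earth's speed to the speed of light. The following sketch assumes modern values for the mean orbital speed, the equatorial rotation speed and the speed of light; none of these numbers comes from Arago's memoir, and the variable names are illustrative only.

```python
# Ratio of the Earth's speed to the speed of light (modern values, assumed).
C_M_PER_S = 299_792_458.0          # speed of light in vacuum
EARTH_ORBITAL_SPEED = 29.78e3      # mean orbital speed around the Sun, m/s
EARTH_EQUATOR_SPEED = 465.0        # rotation speed at the equator, m/s

ratio_orbit = EARTH_ORBITAL_SPEED / C_M_PER_S
ratio_rotation = EARTH_EQUATOR_SPEED / C_M_PER_S

print(f"orbital motion : v/c = {ratio_orbit:.2e}")
print(f"daily rotation : v/c = {ratio_rotation:.2e}")
```

Only the orbital motion gives a ratio close to 1/10,000; the daily rotation is smaller by almost two orders of magnitude.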


Not ready to leave Newton’s theory, Arago justified his negative result by arguing 
that (Ref. 22, p. 47) 


one can only explain [these results] by assuming that luminous bodies emit 
rays with any kind of velocities, provided that one also admits that these 
rays are visible only when their velocities are between given limits: Accord- 
ing to this hypothesis, indeed, the visibility of rays will depend on their 
relative velocities and, since their equal velocities determine the quantity 
of refraction, visible rays are equally refracted. 


In spite of this, Arago remained unable to provide a convincing explanation. He
then asked his friend Augustin Fresnel (1788-1827) to address this problem using
the wave theory that he defended. Fresnel thus provided an explanation considering
“light as in the vibrations of a universal medium” (Ref. 23, p. 628), the so-called
ether. For Fresnel, “the velocity with which waves are propagating is independent
from the motion of the body from which they are emitted” (Ref. 23, p. 628). But he
was not able to explain the stellar aberration with the theory according to which
the Earth “induces its motion to the surrounding ether” (Ref. 23, p. 628) and
that would easily lead one to conceive “why the same prism always refracts light in
the same manner, for any side from where it comes” (Ref. 23, p. 628). Fresnel only
obtained an explanation of the phenomena by assuming that the ether goes freely
through the globe, and that “the velocity communicated to this subtle fluid is only
a small part of the velocity of the Earth” (Ref. 23, p. 628), typically less than one-hundredth.
This line of reasoning led to the assumption of a partial dragging, by the
motion of the Earth, of the ether contained in transparent media, an explanation
in agreement with “the extreme porosity of bodies” (Ref. 23, p. 628), then supported
by the most important scientists.

Newton’s theory was, in its main concepts, a dual theory combining waves
and particles. But as evidenced by Young, a wave theory had some undeniable
advantages for explaining some optical phenomena. This is what Fresnel summed
up as “wave theory is more adaptable to all phenomena than Newton’s theory” 
(Ref. 24, p. 12, Sec. 7); indeed, Newton had to multiply the assumptions to describe 
optical phenomena using particles (Ref. 24, p. 12, Sec. 5): 


Situations with easy reflexion and easy transmission are nearly non explain- 
able in Newton’s system. Thus, he presents them as new properties of light, 
and does not try to link them to the basis of his theory. It seems to me that 
these periodic variations in light disposals would be easier to conceive by
considering light as produced by vibrations of calorific medium, because, 
in the same wave, it would have successively different velocities, different 
degrees of pressure and would repeat itself in the next undulations. 


Considering Newton’s explanation on the double refraction observed in the Island 
Crystal, Fresnel added (Ref. 24, p. 12, Sec. 6): 


The double refraction forced Newton to make again a new assumption, 
which is quite extraordinary; luminiferous molecules have poles, and the 
Island Crystal turns in the same direction poles of the same kind. Malus 
proved, by his beautiful experiments on the polarization of light, that it 
was modified in the same way as it is reflected with a certain angle by a
non-tinned mirror. Is it necessary to admit poles in luminiferous molecules to
conceive this phenomenon, and might it not be possible to assume that the
mirror imposes on the vibrations of light, along the reflexion plane, a particular
modification, which makes it more able to be reflected in one direction
than in the other?


Based on his objections against the Newtonian theory of light, in particular for the
explanation of reflection and refraction, Fresnel began a series of meticulous experiments
on diffraction whose principle was similar to those conducted by Newton, and
which allowed him to develop a first theory for diffraction using light considered as
waves.²⁴ Some difficulties still remained, such as those encountered in explaining
reflexion and diffraction (which had only been explained using a corpuscular theory)
as well as the polarization of light. Fresnel wanted to overcome these difficulties and
to explain all optical phenomena (Ref. 25, p. 4):


New phenomena, compared to those previously known, daily increase the 
probabilities in favor of a system of undulations. Although neglected for a 
long time, and more difficult to follow on its mechanical consequences than 
the emission theory, [a wave theory] already provides larger computations 
abilities [...]. It is interesting for improvements in optics and everything 
related, that is, the whole physics and chemistry, to know whether luminous 
molecules are launched from bodies which are sending light on us up to our 
eyes, or whether light is propagated by vibrations of an intermediary fluid 
to which particles of these bodies transfer their oscillations. 


The wave theory was very rarely used at that time but had received “a strik- 
ing confirmation” by “curious experiments” (Ref. 25, p. 4) conducted by Thomas 
Young. Comparing the advantages and disadvantages of a corpuscular theory to
those provided by a wave theory, Fresnel pointed out the difficulties encountered
in explaining Young’s experiments and concluded: “Diffraction phenomena
are not explainable with the emission theory” (Ref. 25, p. 33). Consequently, light
must be considered as a wave. During its propagation in an elastic medium, such
a wave gives to the molecules of the etherous fluid an oscillatory motion to which
it is possible to apply laws from mechanics and to explain the manner in which
interference fringes are produced (Ref. 25, p. 36):


It is sufficient that these movements are oscillatory, that is, carry molecules 
alternatively in two opposite directions, to have the effect of a wave series 
destroyed by the effect of another series with the same intensity [...]. For 
instance, in waves formed at the surface of a liquid, oscillations are ver- 
tical, and propagation is horizontal and, consequently, along a direction 
perpendicular to the first one. 


Considering the etherous fluid as homogeneous and isotropic, light waves
propagate with a constant velocity because “the velocity for propagation (which 
should not be mistaken with the absolute velocity of molecules) only depends on the 
density and the elasticity of the fluid” (Ref. 25, p. 37). Light intensity therefore 
depends on the vibration intensity of the ether “and color will depend on the dura- 
tion of each oscillation or on the wave-length since one is proportional to the other” 
(Ref. 25, p. 43). Fresnel completed “the foundations to build the general theory of 
diffraction” (Ref. 25, p. 59) by investigating Huygens’ principle according to which 
(Ref. 25, pp. 59-60): 


Vibrations of a light wave in each of its points can be seen as the result of 
elementary movements sent at the same time, acting independently, by all 
parts of this wave considered in any one of its previous positions; 


and which Fresnel considered as “a rigorous consequence” of the system of undulations.
These elementary movements, also called shakings (ébranlements) by Fresnel,
have certain properties as follows (Ref. 25, p. 60):


I will assume that these shakings, in an infinite number, are all of the same 
kind, occur simultaneously, are adjoining and located in the same plane 
or in the same spherical surface. I will make yet an assumption related 
to the nature of these shakings: I will assume that the velocities given 
to molecules are all oriented in the same direction, perpendicular to the 
spherical surface, and are however proportional to condensations, in a ratio 
such that molecules cannot have retrograde movements. I will have thus 
reconstructed a wave from its partial shakings. 


The relevant aspect of Huygens’ principle is that any wave can be decomposed into
“elementary” or “partial” movements. This is the key concept which allowed Fresnel
to understand that double refraction is the signature of the decomposition of a wave 
into two different components with different velocities (Ref. 25, p. 93): 


The difference between the squares of the propagation velocities of ordinary 
and extraordinary rays is proportional to the square of the sine of the angle 
that there is between each of their directions and the [crystal] axis. It results
from these facts that the two beams produced by the double refraction do
not have the same optical properties around their direction, since they are
affected either by the ordinary refraction or by the extraordinary refraction, 
depending on whether the main section of the second crystal is oriented 
along a certain plane or perpendicular to this plane. If we draw straight 
lines perpendicular to rays along these planes, and if we conceive them as 
carried by the set of waves in its course, they will indicate the two directions 
in which [this crystal] presents opposite optical properties. 


Fresnel then explained how there could be different properties of the waves in the 
two beams made of (Ref. 25, p. 96) 


transverse movements (I call transverse movements the oscillations of eth- 
erous molecules that would occur perpendicular to the direction of rays) 
which could not be the same in the two directions [...]. This is not only by 
its path through a crystal which splits it into two distinct beams that light 
receives this particular modification: It can be also polarized by a simple 
reflexion on the surface of a transparent body, as Malus was the first to
observe. 


Fresnel thus built a mechanical description of light propagation by treating the
velocities of oscillations as they are treated in classical problems, that is, a velocity
has components which can be isolated or determined. Nevertheless, in
mechanics forces are responsible for motion. In the case of light, since no
force had yet been identified, these velocities associated with the “elementary movements”
seemed for Fresnel to play the role of forces. Such a status explains why it was so
difficult for him to establish that these oscillations, then considered as responsible
for the motion (propagation) of light, were perpendicular to the direction of propagation.
Three directions are thus needed for describing light propagation: The main
one corresponds to the axis along which the propagation occurs; perpendicular to
it lies the plane containing the two components evidenced by using the
double refraction. These “elementary movements” are perpendicular to light rays
(Ref. 25, p. 101):


the two components of the velocity are also proportional to sin i and
cos i, according to the principle of composition and decomposition of small
motions in fluid, which must be [decomposed] as forces in statics. Malus’ 
law seems thus to indicate that oscillatory movements of etherous molecules 
arise perpendicularly to rays. 
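
Fresnel's decomposition can be illustrated numerically: a transverse oscillation of amplitude A, making an angle i with a reference direction, splits into components A cos i and A sin i, and the squared amplitudes (intensities) add up to the total, the cos² part being Malus' law. This is a minimal sketch in modern notation rather than Fresnel's own; the function names are illustrative.

```python
import math
from typing import Tuple


def decompose(amplitude: float, angle_rad: float) -> Tuple[float, float]:
    """Split a transverse oscillation into two perpendicular components."""
    return amplitude * math.cos(angle_rad), amplitude * math.sin(angle_rad)


def malus_intensity(incident_intensity: float, angle_rad: float) -> float:
    """Intensity kept along the reference direction (Malus' law, cos^2)."""
    return incident_intensity * math.cos(angle_rad) ** 2


angle = math.radians(30.0)
a_parallel, a_perpendicular = decompose(1.0, angle)

# Intensities are proportional to squared amplitudes; the two add up to 1.
total_intensity = a_parallel ** 2 + a_perpendicular ** 2
print(a_parallel ** 2, a_perpendicular ** 2, total_intensity)
```

The decomposition only conserves the total intensity because the two components are perpendicular, which is the geometric content of the transversality argument.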


Fresnel was thus able to show that light was associated with wave propagation made
of transverse oscillations which can be decomposed (polarized) into two components.
It is interesting to note (i) that the term “polarization”, first introduced by Malus
to designate this light property, comes from the first attempt to explain the
double refraction using the “poles” of magnets and (ii) that Fresnel pushed the wave
theory a little further by understanding that the two components were not
poles but particular directions along which elementary oscillatory movements can
be decomposed. Nevertheless, Fresnel attributed these oscillations to a universal
fluid, the ether, as all his predecessors did, and understood that its oscillations more
than its molecules were taking part in light-matter interactions (Ref. 25, p. 141):


If light is only a certain type of vibrations of a universal fluid, as diffraction
phenomena show it, one must no longer assume that its chemical action on
bodies consists in a combination of its molecules with their own ones, but in
a mechanical action that vibrations of this fluid exert on ponderable
particles, and that force them to new arrangements, to a new more stable
equilibrium system, for the type or the energy of vibrations to which they
are exposed.


Thus, Fresnel’s works on diffraction, extended to luminous phenomena, led to the establishment
of a rigorous theory of light, which was considered as a transverse polarized
wave propagating through oscillations of etherous molecules according to the laws of
mechanics.


5. From Electrodynamics to Light 
5.1. Ampère’s law


On the one hand, Fresnel’s contribution convinced many that the wave theory had
serious advantages in explaining optical phenomena. On the other hand, magnetic
and elastic forces were briefly suggested by Newton to explain some optical phenomena
but no conclusive explanations were provided. Nevertheless, in both cases,
laws governing optical phenomena were not yet established, nor were those for electric
and magnetic phenomena. One noticeable contribution for the latter was
provided by André-Marie Ampère (1775-1836) who proposed an empirical law for
describing the reciprocal action between two current elements. Following Newton’s
methodology, Ampère’s approach consists in (Ref. 26, p. 176):


First observing the facts, varying their circumstances as much as possi- 
ble, joining to this first work accurate measurements in order to conclude 
with general laws based on experiments, and deducing from the so-obtained 
laws, independently from any assumption on the nature of the forces pro- 
ducing the phenomena, the mathematical value of these forces, that is, the 
mathematical formula that describes them.


In spite of this, Ampère was forced to add a few assumptions when he considered
the mutual action of two elements of current, mainly because he was unable to
conduct experiments with infinitely small parts of voltaic circuits.

By restricting himself to the observations of balance between two voltaic
elements, he thus assumed that the action between two elements of current is
a force along the straight line that joins them. This is one of the simplest
hypotheses used by Newton for describing any force acting between particles: Such
a force is always acting (Ref. 26, p. 178)


along the straight line which joins them, in such a way that the action 
produced by one onto the other is equal and opposed to the force simulta- 
neously produced by the latter onto the former. 


In such a case, when these two particles are permanently linked to each other, 
there is no motion resulting from their mutual action. Applying this to two current
elements placed at a given distance one from the other, Ampère found that
“their mutual action depends on the lengths, the intensities of currents, and on their 
respective positions” (Ref. 26, p. 200). The mathematical expression corresponding 
to this feature was then (Ref. 26, p. 204) 


$$\frac{i\,i'\,ds\,ds'}{r^{n}}\left(\cos\varepsilon + h\cos\theta\cos\theta'\right), \qquad (1)$$


for any two current elements of lengths $ds$ and $ds'$, having for intensities $i$ and $i'$,
respectively, and where $\varepsilon$ is the angle between the two current elements, $\theta$ and $\theta'$
the angles that these elements, taken along the direction of their currents, make
with the line of length $r$ joining their centers, and the constant $h = k - 1$ where
$k$ represents the action of one element onto the other. He then rewrote Eq. (1) as
(Ref. 26, p. 207)


$$-\frac{i\,i'\,ds\,ds'}{r^{n}}\left(r\,\frac{d^{2}r}{ds\,ds'} + k\,\frac{dr}{ds}\frac{dr}{ds'}\right) \qquad (2)$$

and showed that the action is equal to the reaction only when n = 2 (Ref. 26, 
p. 232). 

Ampère mainly investigated the phenomena produced by electric current. He
clearly distinguished these phenomena, which he designated as “electrodynamic” phenomena,
from those produced by the interaction between a magnet and an electric
current, which were commonly designated as “electromagnetic” phenomena (Ref. 26,
p. 298). Nevertheless, at that time, the term “electromagnetic” was already used
for designating phenomena produced by two current elements, and Ampère’s terminology
was therefore not retained.

Ampère conceded that more difficult investigations should be conducted to determine
whether “electrodynamic phenomena” explained in terms of movement of
the ether could also lead to the same formula. Such a question left open by Ampère
already suggests a possible analogy, if not more, between, on the one hand, electromagnetic
and electrodynamic phenomena and, on the other hand, light propagation
(Ref. 26, p. 301):


If it were possible to prove on the basis of this consideration, that the 
reciprocal action of two elements was in fact proportional according to the 
formula that I have described it, then this account of the fundamental fact 
of the entire theory of electrodynamic phenomena would obviously have 
to be preferred to every other theory; it would, however, require investigations
which I have had no time to perform myself, nor the still
more difficult investigations which one would have to undertake in order to
ascertain whether the opposing explanation, whereby one attributes electrodynamic
phenomena to motions imparted to the ether by the electrical currents,
could lead to the same formula.


Unfortunately, Ampère’s law (2) does not explain a certain class of electrodynamic
phenomena such as the Volta induction discovered by Michael Faraday (1791-1867)²⁷
and corresponding to the reciprocal action produced by (i) two electric
charges, one moving with respect to the other or, (ii) two current elements, one current
varying with respect to the other. Looking for a generalized formula explaining
these latter phenomena, Wilhelm Weber (1804-1891) expected a single law for electrostatics
as well as for electrodynamics.²⁸ In fact, Ampère expected such a single
formula too but without taking into account the possibility for the two current elements
to be in relative movement, nor considering varying currents. From Weber’s
point of view (Ref. 28, p. 82):


[Ampère’s] law holds only as a particular law and still requires a definitive
law with truly general validity applicable to all electrodynamic phenomena 
to replace it. 


In order to do that, Weber considered electric current as made of electric masses: 
“The electrical fluids in the two current elements themselves have in them like 
amounts of positive and negative electricity, which, in each element, are in motion 
in an opposite fashion” (Ref. 28, p. 83). The mutual action between two current 
elements thus results from (Ref. 28, p. 83) 


four reciprocal actions of electrical masses to consider: two repulsive, between
the two positive and between the two negative masses in the current ele- 
ment, and two attractive, between the positive mass in the first and the 
negative mass in the second, and between the negative mass in the first 
and the positive mass in the second. 


Weber also investigated situations where the two current elements had various 
orientations, relative velocities and relative accelerations in order to get a general 
formula also describing induction phenomena evidenced by Faraday. Using the electrostatic
system of units, he thus obtained the force with which two charged masses
act upon one another, expressed as (Ref. 28, p. 89)


$$\frac{e\,e'}{r^{2}}\left(1 - a^{2}\,\frac{dr^{2}}{dt^{2}} + 2a^{2}\,r\,\frac{d^{2}r}{dt^{2}}\right), \qquad (3)$$


where $e$ and $e'$ are isolated electrical charges with positive or negative values, $r$
is the distance between them, $\frac{dr^{2}}{dt^{2}}$ is the square of the relative velocity between
the two masses and $\frac{d^{2}r}{dt^{2}}$ their relative acceleration. The constant $a^{2}$ remained to
be determined. The relative velocity $\frac{dr}{dt}$ between two electrical masses could be
positive or negative, depending on whether the two masses are moving away from
or approaching one another. When the two electrical charges are at rest ($\frac{dr}{dt} = 0$)
this law reduces to Coulomb’s electrostatic law.
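
To make the limiting case concrete, the sketch below evaluates Weber's expression in the form F = (e e'/r²)(1 − a² (dr/dt)² + 2 a² r d²r/dt²), which is assumed here to be the content of Eq. (3). Units and the numerical value of a are arbitrary and purely illustrative, as is the function name. "At rest" is taken to mean both zero relative velocity and zero relative acceleration.

```python
def weber_force(e1: float, e2: float, r: float,
                dr_dt: float, d2r_dt2: float, a: float) -> float:
    """Weber's force between two charges (electrostatic units, assumed form).

    dr_dt is the relative radial velocity, d2r_dt2 the relative radial
    acceleration, and a is Weber's constant (value illustrative here).
    """
    coulomb = e1 * e2 / r ** 2
    return coulomb * (1.0 - a ** 2 * dr_dt ** 2 + 2.0 * a ** 2 * r * d2r_dt2)


# Charges at rest: the velocity and acceleration terms vanish,
# leaving Coulomb's law e1*e2/r^2.
force_at_rest = weber_force(1.0, 1.0, 2.0, dr_dt=0.0, d2r_dt2=0.0, a=0.5)

# With no acceleration, the force vanishes when the relative velocity is 1/a.
force_at_critical_speed = weber_force(1.0, 1.0, 2.0, dr_dt=2.0, d2r_dt2=0.0, a=0.5)

print(force_at_rest, force_at_critical_speed)
```

The second evaluation shows why 1/a has the dimension of a velocity: it is the uniform relative speed at which the velocity term cancels the electrostatic term.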

It was later demonstrated by Henri Poincaré (1854-1912) that in the electro- 
magnetic system of units, this force between two electric charges is expressed as 
(Ref. 29, p. 34): 


$$-\frac{i\,i'\,ds\,ds'}{r^{2}}\left(r\,\frac{d^{2}r}{ds\,ds'} - \frac{1}{2}\,\frac{dr}{ds}\frac{dr}{ds'}\right), \qquad (4)$$

which is in complete agreement with Ampère’s law. Thus, the reciprocal action
is directly proportional to the current intensities in the two current elements and
inversely proportional to the square of the distance between them; moreover it can
distinguish repulsive from attractive forces (Ref. 28, p. 92):


The force [not only depends on] the magnitude of the masses and their 
distance from one another, but also on their relative velocity and relative 
acceleration. 


Weber finally got “the general law” (Ref. 28, p. 98) 

$$\frac{e\,e'}{r^{2}}\left(1 - \frac{a^{2}}{16}\,\frac{dr^{2}}{dt^{2}} + \frac{a^{2}}{8}\,r\,\frac{d^{2}r}{dt^{2}}\right) \qquad (5)$$

for the force between two ponderable charges. He was thus able to conclude that this
force did not only depend on the two ponderable charges but also “on the presence
of a third body” (Ref. 28, p. 141). For Weber, there is thus a medium transmitting
the force between the ponderable charges. Moreover, while discussing Faraday’s
experiments on the influence of electrical currents on light,³⁰ Weber explained that
it is (Ref. 28, p. 142)


not improbable that the all-pervasive neutral electrical medium is itself 
that all-pervasive ether, which creates and propagates light vibrations, or 
that at least the two are so intimately interconnected that observations of 
light vibrations may be able to explain the behavior of the neutral electrical 
medium. 


In 1852, Weber replaced the constant a in formula (5) by = and obtained the 
new formulation 
ee! i 1 dr? | ar d’r . (6) 
r2 C2 dt? e dt? 


where é designates Weber’s constant” and is different from the constant c used 
for expressing the light velocity in vacuum. The constant ¢ represents the number 


ᵇ In fact, Weber designated his constant as c, but because this constant had a different
value from the one known as the light velocity, we changed the notation to distinguish
Weber’s constant $\mathfrak{c}$ from c, the velocity of light.

of electrostatic units in one electromagnetic unit of electricity and its value was
determined ten years later by Weber and Rudolf Kohlrausch (1809-1858)31; they
obtained 

$$\mathfrak{c} = 4.3945\cdot10^{8}\ \mathrm{m\cdot s^{-1}}. \tag{7}$$
Then, taking into account that, to produce the same effect, units for electromagnetic
current i are greater by a factor $\sqrt{2}$ than those used for electrodynamic current j
($j = \sqrt{2}\,i$) (Ref. 28, p. 18), it followed that
$$\frac{\mathfrak{c}}{\sqrt{2}} = 3.1074\cdot10^{8}\ \mathrm{m\cdot s^{-1}}. \tag{8}$$
For Weber, the physical meaning of this constant $\mathfrak{c}$ was the relative velocity that two
particles with charges e and e′ must have, and keep, in order to exert no action on
each other.
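The arithmetic linking Weber's constant to the value quoted above can be checked in a few lines. This is only an illustrative sketch: the input is the Weber and Kohlrausch figure as it is usually quoted, 439450·10^6 mm/s (an assumption about the exact number intended here), the modern value of the speed of light is brought in purely for comparison, and the names `WEBER_CONSTANT`, `SPEED_OF_LIGHT` and `reduced_weber_constant` are hypothetical.

```python
import math

# Weber-Kohlrausch value of Weber's constant, converted to m/s.
# Assumption: the commonly quoted figure 439450e6 mm/s, i.e. 4.3945e8 m/s.
WEBER_CONSTANT = 4.3945e8

# Modern defined value of the speed of light, used only for comparison.
SPEED_OF_LIGHT = 299_792_458.0


def reduced_weber_constant(weber_constant: float) -> float:
    """Return Weber's constant divided by sqrt(2)."""
    return weber_constant / math.sqrt(2.0)


reduced = reduced_weber_constant(WEBER_CONSTANT)
relative_gap = (reduced - SPEED_OF_LIGHT) / SPEED_OF_LIGHT

print(f"reduced constant: {reduced:.4e} m/s")
print(f"relative gap to the modern speed of light: {relative_gap:.1%}")
```

The reduced constant lands within a few percent of the modern speed of light, which is the numerical coincidence that the following paragraphs build on.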

In 1857, Gustav Kirchhoff (1824-1887) used Weber’s theory to develop a theory 
of electric current in conductors of any form, consisting in fact of a generalization 
of his theory for electric current in linear conductors that he developed in earlier 
papers.32,33 As in the case of linear conductors, Kirchhoff proved34


that [...] the electricity in the wire progresses like a wave in a taut string
with the velocity of light in empty space.


Stimulated by Kirchhoff’s work, Weber came to the conclusion that35 “$\mathfrak{c}/\sqrt{2}$ is the
limit towards which converges all propagation velocities and [...] this limit has for
value $\mathfrak{c}/\sqrt{2} = 310.740\cdot10^{6}\ \mathrm{m\cdot s^{-1}}$” (Ref. 35, p. 622), thus confirming Kirchhoff’s
results. He then suggested a possible connection between light and electromagnetic 
phenomena: 


If this approximated agreement between the propagation speed of electrical 
waves and the light speed could be seen as an indication of a close relation 
between these two doctrines, it would deserve a great interest, because the 
research for such a relation is of a great importance. 


5.2. Maxwell’s electromagnetic waves as light


The breakthrough was provided by James Clerk Maxwell (1831-1879), who published
the first electromagnetic theory36 in 1865, claiming that light could be an
electromagnetic wave. He started with Weber’s theory, which was the most complete
at that time, but he encountered some difficulties in providing a mechanical
explanation of its functioning because of the assumption that particles were acting at
a distance with forces depending on their velocities. These mechanical difficulties
prevented him from considering Weber’s theory “as an ultimate one”. For Maxwell,
phenomena must be explained (Ref. 36, p. 460)


by supposing them to be produced by actions which go on in the surrounding
medium as well as the excited bodies, and endeavouring to explain
the action between distant bodies without assuming the existence of forces
capable of action directly at sensible distances.


In fact, Maxwell replaced any action at a distance by the concept of the electro- 
magnetic field as suggested by Faraday. The electromagnetic field was thus “a part 
of space which contains and surrounds bodies in electric or magnetic conditions” 
(Ref. 36, p. 460). It was clear for Maxwell that “in this space, there is matter in 
motion, by which the observed electromagnetic phenomena are produced” (Ref. 36, 
p. 460); this was therefore a “dynamical theory”. 

Following Ampère’s and Weber’s assumptions, Maxwell had (Ref. 36, p. 460)


therefore some reason to believe from the phenomena of light and heat, that 
there is an aethereal medium filling space and permeating bodies, capable 
of being set in motion and of transmitting that motion from one part to 
another and affects it in various ways. 


Maxwell thus avoided considering action at a distance by using the ether “filling
space and permeating bodies” (Ref. 36, p. 460) with the properties which were 
already encountered in the works by Newton and Fresnel; for instance, the ether has 
“small but real density, [is] capable of being set in motion, and of transmitting motion 
from one part to another with great, but not infinite velocity; [it has] certain kind 
of elasticity yielding” (Ref. 36, p. 460). Using the electromagnetic field, Maxwell 
was thus able to explain how two current elements were interacting at a distance 
(Ref. 36, p. 464): 


When an electric current is established in a conducting circuit, the neigh- 
boring part of the field is characterized by certain magnetic properties, and 
that if two circuits are in the field, the magnetic properties of the field due 
to the two currents are combined. Thus each part of the field is in con- 
nexion with both currents, and the two currents are put into connexion 
with each other in virtue of their connexion with the magnetization of the 
field. 


As in Fresnel’s essay, double refraction was a key phenomenon to be explained;
Maxwell did so by using the electromagnetic field, to which polarization could then
be linked. He thus explained Faraday’s experiments in which light polarization
was affected by a magnetic field as follows (Ref. 36, p. 461):


The luminiferous medium is in certain cases acted on by magnetism, [as 
evidenced by Faraday’s experiments®? showing that] when a plane polarized 
ray traverses a transparent diamagnetic medium in the direction of the lines 
of magnetic field produced by magnets or currents in the neighborhood, the 
plane of polarization is caused to rotate. 


The ether therefore has magnetic properties. It is subject to motion as well as to
vibrations: electromagnetic phenomena were more related to motion, and light to
vibrations. Maxwell was able to draw some analogy between these two types of
phenomena (Ref. 36, p. 464):


It appears therefore that certain phenomena in electricity and magnetism 
lead to the same conclusion as those of optics, namely, that there is an 
aethereal medium pervading all bodies, and modified only in degree by 
their presence; that the parts of this medium are capable of being set 
in motion by electric currents and magnets, that this motion is commu- 
nicated from one part of the medium to another by forces arising from 
the connexions of those parts; that under the action of these forces there 
is a certain yielding depending on the elasticity of these connexions, and 
that therefore energy in two different forms may exist in the medium, the 
one form being the actual energy of motion of its parts, and the other 
being the potential energy stored up in the connexions, in virtue of their 
elasticity. 


Maxwell summed up all existing electromagnetic phenomena into 20 differential 
equations; in doing so, he used a local electromagnetic field in the ether filling the 
space surrounding electric and magnetic bodies. He thus described the mechanical 
actions applied to these bodies. 

Expressed in the electrostatic system of units, the units used for describing 
mechanical actions between electrified bodies must be corrected by a coefficient k 
taking into account the mechanical action between currents in the electromagnetic 
system of units. For Maxwell, the coefficient k represents “the coefficient of electric 
elasticity in the medium in which the experiments are made, i.e. common air”
(Ref. 36, pp. 491-492); it is related to v, the number of electrostatic units in one
electromagnetic unit, by the relation
$$k = 4\pi v^{2}, \tag{9}$$
where v is in fact Weber’s constant $\mathfrak{c}/\sqrt{2}$, which is not so different from the value of
light velocity. A natural task, for Maxwell, once the ether was required to explain
light propagation as well as electromagnetic phenomena, was to question (Ref. 36,
p. 497)


whether these properties of that which constitutes the electromagnetic field, 
deduced from electromagnetic phenomena alone, are sufficient to explain 
the propagation of light through the same substance. 


Applying his equations governing the electromagnetic field to the propagation 
of a plane wave, Maxwell showed “that the wave is propagated in either direction 
with a velocity” (Ref. 36, p. 498) 


$$V = \pm\sqrt{\frac{k}{4\pi\mu}}, \tag{10}$$
where $\mu$ is the coefficient of magnetic induction which depends “on the nature of
the medium, its temperature and the amount of magnetization already produced”
(Ref. 36, p. 482). Since experiments to determine the value of k were only made in
air in which $\mu = 1$, the velocity of light V in air is equal to
$$V = v = \frac{\mathfrak{c}}{\sqrt{2}} = 3.1074\cdot10^{8}\ \mathrm{m\cdot s^{-1}}. \tag{11}$$
This value was only slightly different from the light velocity experimentally measured
by Fizeau ($V = 314\,858\,000\ \mathrm{m\cdot s^{-1}}$)37 or the value deduced from the coefficient
for light aberration ($V = 308\,000\,000\ \mathrm{m\cdot s^{-1}}$). Since these two measurements
did not involve electricity or magnetism, Maxwell was led to conclude (Ref. 36,
p. 499):


The agreement of the results seems to show that light and magnetism 
are affections of the same substance, and that light is an electromagnetic 
disturbance propagated through the field according to electromagnetic 
laws. 
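The agreement Maxwell refers to can be made concrete with a small numerical sketch. It assumes the relations k = 4πv² and V = ±√(k/(4πμ)) with μ = 1 in air, takes v = 3.1074·10^8 m/s as the electromagnetic value, and compares it with the two optical values quoted above; the function name `wave_velocity` and the dictionary keys are hypothetical, chosen only for illustration.

```python
import math


def wave_velocity(k: float, mu: float = 1.0) -> float:
    """Velocity of a plane wave, V = sqrt(k / (4 * pi * mu))."""
    return math.sqrt(k / (4.0 * math.pi * mu))


v = 3.1074e8                      # m/s, electromagnetic value quoted in the text
k = 4.0 * math.pi * v**2          # coefficient of electric elasticity, k = 4*pi*v^2
V_air = wave_velocity(k, mu=1.0)  # in air mu = 1, so V reduces to v

# Optical determinations quoted in the text, in m/s.
optical = {"Fizeau": 314_858_000.0, "aberration": 308_000_000.0}
gaps = {name: (V_air - value) / value for name, value in optical.items()}

for name, gap in gaps.items():
    print(f"{name}: relative difference {gap:+.2%}")
```

The electromagnetic value falls between the two optical determinations, about one percent away from each, which is the sense in which the results "agree".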


He also showed with his set of electromagnetic field equations that only transverse
vibrations could be propagated through the medium and, consequently, that
(Ref. 36, p. 499)


this wave consists entirely of magnetic disturbances, the direction of mag- 
netization being in the plane of the wave. No magnetic disturbance whose 
direction of magnetization is not in the plane of the wave can be propagated 
as a plane wave at all. 

Hence magnetic disturbances propagated through the electromagnetic 
field agree with light in this, that the disturbance at any point is transverse 
to the direction of propagation, and such waves may have all the properties 
of polarized light. 


Fresnel’s result for light was thus confirmed here. Hence, he concluded that (Ref. 36, 
p. 501) 


electromagnetic science leads to exactly the same conclusions as optical 
science with respect to the direction of the disturbances which can be 
propagated through the field, both affirm the propagation of transverse 
vibrations and both give the same velocity of propagation. On the other 
hand, both sciences are at a loss when called on to affirm or deny the 
existence of normal vibrations. 


The velocity c of light, which emerges from all these theories as a constant,
was presented by George Stoney (1826-1911) at the Belfast meeting (1874) of the
British Association for the Advancement of Science38 as one of three absolute
quantities: the “velocity of Maxwell”, that is also “the maximum of the velocity of
light”, the constant of gravitation and the unit quantity of electricity. For Stoney,
using the constant c as the unit velocity would provide “an immense simplification
[...] in our treatment of the whole range of electric phenomena, and probably into
our study of light and heat”.38


5.3. Helmholtz’s theory 


When Helmholtz turned his attention to electrodynamics in 1870, the three main
theories were the potential law of Frantz Neumann (1798-1895),*° the reciprocal 
action between two moving ponderable charges by Weber®° and the theory of the 
electromagnetic field in the ether by Maxwell.°° These theories were not so widely 
accepted for various reasons. Maxwell’s ideas were criticized for their lack of rigor
and experimental evidence; Weber’s formula did not obey the principle of energy
conservation because of its dependence on the relative velocity and on the relative
acceleration between ponderable charges, a principle that Helmholtz showed to be
universal.41

In order to overcome such a weakness, Helmholtz also proposed an electrody- 
namic theory including the previously existing theories and describing the propaga- 
tion of light as Maxwell did, but in a more comprehensive form. The three theories 
successively developed by Weber, Neumann and Maxwell were then unified in a 
single formula by means of an arbitrary parameter k whose values were −1, +1
and 0, respectively. Like the previous ones, Helmholtz’s theory was based on the 
existence of an ether. 

From his general law, Helmholtz obtained the differential equations describing
the motion of electricity. Then, by investigating the nature of his differential
equations for the three values of the constant k, he concluded that for k = −1,
corresponding to Weber’s law, the motion of electric charges was unstable, contrary to
what he obtained for the two other values of k. He also showed experimentally that
motions of electricity in conducting bodies were nearly the same as those he obtained
with his equations for k = +1 and k = 0.

Nevertheless, comparing his theory to Maxwell’s, he concluded that42


the two theories are opposed to each other in a certain sense since according
to the theory of magnetic induction originating with Poisson, which can be
carried through in a fully corresponding way for the theory of dielectric
polarization of insulators, the action at a distance is diminished by the
polarization, while according to Maxwell’s theory on the other hand, the 
action at a distance is exactly replaced by the polarization [...]. It follows 
[...] from these investigations that the remarkable analogy between the 
motion of electricity in a dielectric and that of the light aether does not 
depend on the particular form of Maxwell’s hypotheses, but results also
in a basically similar fashion if one maintains the older view point about
electrical action at a distance.


Although Maxwell’s theory was later published in a slightly different form,43 one
that did not remove its internal contradiction, it was preferred to Helmholtz’s theory,
as history has shown.


5.4. Hertz’s experiments for validating Maxwell's theory 


Maxwell left his theory without any experimental proof. Such experimental evidence
was provided by Heinrich Hertz (1857-1894)44–46 while addressing the question
related to the “Berlin Academy of Science Prize” proposed by Helmholtz for
the year of 1879: “To establish experimentally any relation between electromagnetic
forces and the dielectric polarization of insulators” (Ref. 47, p. 1). Hertz’s research
was thus guided by the connection propounded by Helmholtz, according to which
(Ref. 47, p. 6):


If we start from the electromagnetic laws which in 1879 enjoyed universal 
recognition, and make certain further assumptions, we arrive at the equa- 
tions of Maxwell’s theory which at that time (in Germany) were by no 
means universally recognized. 


For Hertz, the question was to experimentally investigate the validity of Maxwell’s 
conclusion that light could be an electromagnetic wave. Hertz was initially con- 
vinced that such a claim was unacceptable (Ref. 47, p. 8): 


I reflected that it would be quite as important to find out that electric 
force was propagated with an infinite velocity, and that Maxwell’s theory 
was false, as it would be, on the other hand, to prove that this theory 
was correct, provided only that the result arrived at should be definite and 
certain. [...] 

I have entered into these details here in order that the reader may be 
convinced that my desire has not been simply to establish a preconceived 
idea in the most convenient way by a suitable interpretation of the experi- 
ments. On the contrary, I have carried out with the greatest possible care 
these experiments (by no means easy ones) although they were in opposi- 
tion to my preconceived views. 


Hertz’s experiments were as follows (Ref. 45, p. 107): 


In the first place, regular progressive waves were to be produced in a 
straight, stretched wire by means of corresponding rapid oscillations of a 
primary conductor. Next, a secondary conductor was to be exposed simul- 
taneously to the influence of the waves propagated throughout the wire and 
to the direct action of the primary conductor propagated through the air; 
and thus both actions were to be made to interfere. Finally, such interferences
were to be produced at different distances from the primary circuit,
so as to find out whether the oscillations of the electric force at great distances
would or would not exhibit a retardation of phase, as compared with
the oscillations in the neighborhood of the primary circuit. 


Hertz was thus investigating the influence of primary oscillations on a secondary 
circuit, taking into account their relative positions, a possibly relevant aspect for 
testing the electromagnetic effect independently of the electrostatic effect. From his 
experiments, he found “that the waves in the wire have the same periodic time as 
the primary oscillations” (Ref. 45, p. 111). Using interference phenomena resulting 
from the interaction between the waves propagated in air with the waves propagated 
in a wire, he observed that (Ref. 45, p. 117): 


Very little consideration will show that, if the action is propagated through 
the air with infinite velocity, it must interfere with the waves in the wire 
in opposite senses at distances of half a wave-length (i.e. 2.8 meters) along 
the wire. Again, if the action is propagated through the air with the same 
velocity as that of the waves in the wire, the two will interfere in the same 
way at all distances. Lastly, if the action is propagated through the air 
with a velocity which is finite, but different from that of the waves in the 
wire, the nature of the interference will alternate, but at distance which 
are farther than 2.8 meters apart. 


Hertz was finally able to assess the velocity of electromagnetic waves (Ref. 45, 
p. 107): 


The experiments carried out in accordance with it have shown that the 
inductive action is undoubtedly propagated with a finite velocity. This 
velocity is greater than the velocity of propagation of electric waves in 
wires. According to the experiments made up to the present time, the ratio 
of these velocities is about 45:28. From this it follows that the absolute 
value of the first of these is of the same order as the velocity of light. 


and concluded that “the absolute velocity of propagation in air is 320.000 km per 
second” (Ref. 45, p. 121). 
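The figures quoted by Hertz can be tied together with a short calculation. The sketch below is not Hertz's own procedure: it assumes that the waves in the wire and in air have the same period, so that wavelengths scale like velocities, and it models the interference by the simple phase difference between two progressive waves, ignoring any near-field effects. All names are hypothetical.

```python
from fractions import Fraction

HALF_WAVELENGTH_WIRE_M = Fraction(28, 10)  # 2.8 m, half wave-length in the wire
VELOCITY_RATIO = Fraction(45, 28)          # air : wire, ratio quoted by Hertz
VELOCITY_AIR_KM_S = 320_000                # Hertz's estimate for air, in km/s

# Same period in both media, so wavelengths scale like velocities.
half_wavelength_air = HALF_WAVELENGTH_WIRE_M * VELOCITY_RATIO

# The relative phase of the two waves changes by half a period over a
# distance d such that d / half_wire - d / half_air = 1.
reversal_distance = 1 / (1 / HALF_WAVELENGTH_WIRE_M - 1 / half_wavelength_air)

# Velocity in the wire implied by the ratio and the estimate for air.
velocity_wire_km_s = VELOCITY_AIR_KM_S / VELOCITY_RATIO

print(f"half wave-length in air: {float(half_wavelength_air):.2f} m")
print(f"distance between interference reversals: {float(reversal_distance):.2f} m")
print(f"implied velocity in the wire: {float(velocity_wire_km_s):.0f} km/s")
```

Under these assumptions the sign of the interference alternates at intervals longer than 2.8 m, which corresponds to the third case described in the passage quoted above (a finite velocity in air, different from that in the wire).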

Although he had experimentally proved that electromagnetic actions are prop- 
agated through air with a finite velocity, Hertz wanted to propose an experiment 
which could “exhibit the propagation of induction through the air by wave-motion 
in a visible and almost tangible form” that would permit “a direct measurement of 
the wave-length in air” (Ref. 46, p. 124). This new experiment was based on the 
interference between direct waves propagating in air and waves which were reflected 
by a wall, producing stationary waves. By repeating this interference experiment 
between two waves for various distances from the reflecting wall, Hertz showed that 
the velocity had the same order of magnitude as the velocity associated with light 
propagation. He thus concluded that (Ref. 46, p. 136)


the experiments amount to so many reasons in favor of that theory of
electromagnetic phenomena which was first developed by Maxwell from 
Faraday’s views. It also appears to me that the hypothesis as to the nature 
of light which is connected with that theory now forces itself upon the 
mind with still stronger reason than heretofore. Certainly it is a fascinating 
idea that the processes in air which we have been investigating represent 
to us on a million-fold larger scale the same processes which go on in 
the neighborhood of a Fresnel mirror or between the glass plates used for 
exhibiting Newton’s rings. 


The result concerning the period of oscillations of electromagnetic waves in air
obtained by Hertz contained a computational error by a factor of $\sqrt{2}:1$; Poincaré first drew
Hertz’s attention to this.47–49

Hertz interpreted all the results provided by his rapid oscillations experiments 
from the standpoint associated with Helmholtz’s theory, that is, from the fact that 
there are “two forms of electric force — the electromagnetic and the electrostatic — 
to which [...] two different velocities are attributed” (Ref. 47, p. 15). Moreover, in a
special limiting case, Helmholtz’s equations became “the same as those of Maxwell’s
theory: only one form of the force remains, and this is propagated with the velocity
of light” (Ref. 47, p. 15). Hertz’s experimental results conferred “upon Maxwell’s
theory a position of superiority to all others” (Ref. 50, p. 137). Hertz was thus
able “to show that the phenomena can be explained in terms of Maxwell’s theory
without introducing” (Ref. 50, p. 137) a distinction between electromagnetic and 
electrostatic forces, as was done by Helmholtz. 

In order to describe electric and magnetic forces acting in the ether at points x
which, according to Maxwell, are such that “the time-rate of change of the forces
is dependent upon their distribution in space” (Ref. 50, p. 138), Hertzᶜ used the
equations (Ref. 50, p. 138)
$$\frac{1}{c}\frac{d\mathbf{H}}{dt} = \nabla\wedge\mathbf{E} \tag{12}$$
and
$$\frac{1}{c}\frac{d\mathbf{E}}{dt} = -(\nabla\wedge\mathbf{H}), \tag{13}$$


ᶜ We chose to translate the equations into modern vector notation, introduced by Oliver
Heaviside,51 rather than keeping the quite cumbersome use of different letters for the components
of a given vector. Apart from that, we left unchanged the way in which the equations were written
in the original paper. For instance, Hertz introduced a minus sign in Eq. (13) and not in Eq. (12)
as they are now written. These equations, describing the electric and magnetic fields in the ether, 
were rewritten with the sign as used today by Poincaré (Ref. 71, p. 373) that is, with a minus sign 
in Eq. (12) (and not in Eq. (13)) by Poincaré (Ref. 29, pp. 115-116). The corresponding equations 
for electromagnetic phenomena in bodies in motion were also rewritten with the corrected sign by 
Lorentz (Ref. 52, pp. 55 and 58).

where E is the electric force (whose components were designated by (X, Y, Z) by
Hertz), H the magnetic force (whose components were designated by (L, M, N) by
Hertz); t is the time and $1/c$ is the reciprocal of the light velocity.

For Hertz, $1/c$ is “in reality an intrinsic constant of the ether; in saying this we
assert that its magnitude is independent of the presence of any other body, or of 
any arbitrary stipulation on our part” (Ref. 53, p. 202). To these equations, Hertz 
added two specific conditions (Ref. 50, p. 138) 


$$\nabla\cdot\mathbf{E} = 0 \tag{14}$$
and
$$\nabla\cdot\mathbf{H} = 0, \tag{15}$$


which must be satisfied and which are required to distinguish the ether from the 
ponderable matter. 
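As a short consistency check, not taken from Hertz's paper, one can verify that these equations imply propagation at the velocity of light. The sketch assumes that Eqs. (12) and (13) read $(1/c)\,d\mathbf{H}/dt = \nabla\wedge\mathbf{E}$ and $(1/c)\,d\mathbf{E}/dt = -\nabla\wedge\mathbf{H}$, with c the velocity of light, and uses the modern identity for the double curl together with the conditions (14) and (15).

```latex
\begin{align*}
\frac{1}{c}\frac{d}{dt}\left(\nabla\wedge\mathbf{H}\right)
  &= \nabla\wedge(\nabla\wedge\mathbf{E})
   = \nabla(\nabla\cdot\mathbf{E}) - \nabla^{2}\mathbf{E}
   = -\nabla^{2}\mathbf{E}
   && \text{(curl of Eq. (12), then Eq. (14))} \\
\nabla\wedge\mathbf{H}
  &= -\frac{1}{c}\frac{d\mathbf{E}}{dt}
   && \text{(Eq. (13))} \\
\Longrightarrow\quad
\frac{1}{c^{2}}\frac{d^{2}\mathbf{E}}{dt^{2}}
  &= \nabla^{2}\mathbf{E}.
\end{align*}
```

The same steps applied to H, using Eq. (15), give the identical wave equation, so under these assumptions both forces propagate in the ether with velocity c.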

The total energy contained in a volume-element $\tau$ of the ether was the sum of
the electric energy (Ref. 50, p. 138)
$$\frac{1}{8\pi}\int E^{2}\,d\tau \tag{16}$$
and the magnetic energy
$$\frac{1}{8\pi}\int H^{2}\,d\tau \tag{17}$$
contained in that volume-element. When electric and magnetic forces act on the
ether, Hertz imposed $\mu = 1$ and $\epsilon = 1$, as he specified: “The specific inductive capacity
(Dielektricitätsconstante) and the magnetic permeability (Magnetisirungsconstante)
of a substance [are] equal to unity for the ether; but this does not state
any fact derived from experience; it is only an arbitrary stipulation on our parts.”
(Ref. 53, p. 200)

Concerning all these equations, Hertz wrote that (Ref. 50, p. 138): 


These statements form, as far as the ether is concerned, the essential parts 
of Maxwell’s theory. Maxwell arrived at them by starting with the idea of 
action-at-a-distance and attributing to the ether the properties of a highly 
polarisable dielectric medium. 


Hertz assumed that it was possible to reach the same statements by using other ways 
“but in no way can a direct proof of these equations be deduced from experience” 
(Ref. 50, p. 138). He was thus able to show that his experiments were in agreement 
with “these much simpler assumptions of Maxwell's theory” (Ref. 50, p. 159): 


In our endeavour to explain the observations by means of Maxwell’s theory, 
we have not succeeded in removing all difficulties. Nevertheless, the theory
had been found to account most satisfactorily for the majority of the phenomena;
and it will be acknowledged that this is no mean performance. But
if we try to adapt any of the older theories to the phenomena, we meet with
inconsistencies from the very start, unless we reconcile these theories with
Maxwell’s by introducing the ether as dielectric in the manner indicated
by Helmholtz.


Nevertheless, like many other scientists, Hertz had “been compelled to abandon the
hope of forming for [himself] an altogether consistent conception of Maxwell’s ideas”
(Ref. 47, p. 29), thus justifying why his experiments were guided by Helmholtz’s
theory and not by Maxwell’s. Unfortunately, his experiments led him to promote
Maxwell’s theory, leaving him with a lack of consistency in his methodology.
Consequently, Hertz wanted “to form for [himself] in a consistent manner
the necessary physical conceptions starting from Maxwell’s equations, but otherwise
simplifying Maxwell’s theory as far as possible” (Ref. 47, p. 21). In fact, Hertz considered
that “Maxwell’s theory is Maxwell’s system of equations” and added that
(Ref. 47, p. 21) 


every theory which leads to the same system of equations, and therefore 
comprises the same possible phenomena, I would consider as being a form 
or special case of Maxwell’s theory; every theory which leads to differ- 
ent equations, and therefore to different possible phenomena, is a different 
theory. 


6. Invariance of the Field Equations from One Frame to Another


6.1. Hertz’s electrodynamic theory


The electromagnetic theory developed by Hertz was published in two papers, one 
for bodies at rest (Ref. 53, p. 195) and one for bodies in motion (Ref. 54, p. 241). 
From Maxwell’s theory, Hertz kept only the equations he corrected according to his 
experiments and wanted to build a rigorous theory leading to them. 

Although informed about similar work55 carried out by Oliver Heaviside (1850-1925),
whose results are identical to those obtained by Hertz, the latter remained convinced
that his theory had to be preferred to all the others because it described
more phenomena accurately. Hertz worked from the experimental facts to the
equations describing them, using a methodology advocated by Newton and Ampère:
“The statement will be rather given as facts derived from experience, and experience 
must be regarded as their proof” (Ref. 53, p. 197). For determining electric and 
magnetic forces as well as the total energy at a given point of the space, Hertz 
made “an essential and important hypothesis” which was (Ref. 53, p. 198) 


that the specification of a single directed magnitude is sufficient to determine
completely the change of state under consideration. Certain phenomena,
e.g., those of permanent magnetism, dispersion, etc., are not intelligible
from this standpoint; they require that the electric or magnetic conditions
of any point should be represented by more than one variable.


Considering his fundamental Eqs. (12)–(15) that he introduced in one of his previous
papers,50 Hertz proposed the equations governing electromagnetic phenomena,
according to his experiments, in bodies of various characteristics (good or bad
conductor, isotropic or crystalline medium), considering that phenomena quantitatively
differ from those observed in the free ether “in two respects: In the first
place, the intrinsic constant has a value different from what it has in the ether;
and in the second place, the expression for the energy per unit volume contains, as
already explained, the constants $\epsilon$ and $\mu$” (Ref. 53, p. 202). As an example, the set
of equations describing electromagnetic phenomena in a conducting isotropic and
homogeneous medium is (Ref. 53, p. 205)
$$\frac{\mu}{c}\frac{d\mathbf{H}}{dt} = \nabla\wedge\mathbf{E} \tag{18}$$
and
$$\frac{\epsilon}{c}\frac{d\mathbf{E}}{dt} = -\nabla\wedge\mathbf{H} - \frac{4\pi\lambda}{c}\mathbf{E}, \tag{19}$$
where $\mu$ is the magnetic permeability of the medium, $\epsilon$ its specific inductive capacity
and $\lambda$ the specific conductibility of the body. Consequently, the total energy of
the electromagnetic field per unit of volume is (Ref. 53, p. 199)
$$\frac{\epsilon}{8\pi}E^{2} + \frac{\mu}{8\pi}H^{2}. \tag{20}$$
As it must be, the set of equations proposed by Hertz and describing electromagnetic 
phenomena in bodies at rest is consistent with the principle of the conservation of 
energy. All electromagnetic phenomena as well as some optical phenomena such 
as reflection or refraction are well described by these equations. Nevertheless, they 
cannot provide an explanation of the dispersion phenomena which need more than 
one oriented magnitude as used in this theory. 
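The consistency with energy conservation can be sketched in modern vector notation. This is an illustrative derivation, not Hertz's own argument. It assumes that the equations for a conducting isotropic medium read $(\mu/c)\,d\mathbf{H}/dt = \nabla\wedge\mathbf{E}$ and $(\epsilon/c)\,d\mathbf{E}/dt = -\nabla\wedge\mathbf{H} - (4\pi\lambda/c)\,\mathbf{E}$, that the energy density is $\epsilon E^{2}/8\pi + \mu H^{2}/8\pi$, and that $\epsilon$, $\mu$ and $\lambda$ are constant.

```latex
\begin{align*}
\frac{d}{dt}\left(\frac{\epsilon}{8\pi}E^{2} + \frac{\mu}{8\pi}H^{2}\right)
 &= \frac{1}{4\pi}\left(\epsilon\,\mathbf{E}\cdot\frac{d\mathbf{E}}{dt}
   + \mu\,\mathbf{H}\cdot\frac{d\mathbf{H}}{dt}\right) \\
 &= \frac{c}{4\pi}\left[\mathbf{H}\cdot(\nabla\wedge\mathbf{E})
   - \mathbf{E}\cdot(\nabla\wedge\mathbf{H})\right] - \lambda E^{2} \\
 &= \frac{c}{4\pi}\,\nabla\cdot(\mathbf{E}\wedge\mathbf{H}) - \lambda E^{2}.
\end{align*}
```

Under these assumptions the energy density at a point changes only through a flux term, which is a pure divergence, and through the loss $\lambda E^{2}$ dissipated in the conductor, so no energy appears or disappears without cause.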

Developing further his first paper for bodies at rest, Hertz extended his the- 
ory “to embrace the course of electromagnetic phenomena in bodies which are in 
motion” (Ref. 54, p. 241). By bodies in motion, Hertz considered ponderable mat- 
ter and “the disturbances of the ether which simultaneously arise [and] cannot be 
without effect; and of these we have no knowledge” (Ref. 54, p. 241). In order to do 
that, he needed to add an assumption on the motion of the ether: In a ponderable 
body, there is a total driving of the ether. Hertz thus assumed that (Ref. 54, p. 242): 


Electric and magnetic phenomena must be compatible with the view that 
no [partial driving] occurs, but that the ether which is hypothetically 
assumed to exist in the interior of ponderable matter only moves with
it. This view includes the possibility of taking into consideration at every
point in space the condition of only one medium filling the space. 


Here, Hertz proposed to fuse into a single medium the transparent body traversed 
by light and the ether that should exist in it; at least in a ponderable body, the 
ether could thus be omitted when explaining light propagation. Such an assumption
was mainly motivated by the fact that, by his own account, Hertz had no experimental
data related to a partial driving.

He admitted that the theory he developed (Ref. 54, pp. 242-243) 


on such a foundation will not possess the advantage of giving to every 
question that may be raised the correct answer, or even of giving only one 
definite answer; but it at least gives a possible answer to every question 
that may be propounded, i.e. answers which are not inconsistent with the 
observed phenomena nor yet with the views which we have obtained as to 
bodies at rest. 


Hertz then considered that each point of a body is characterized by the electric 
force E and the magnetic force H, the electric polarization P and the magnetic 
polarization M, the electric current I, the electromotive force E’, the magnetic and 
dielectric constants y and e¢, respectively, and the conductibility constant AX. 

In his previous essay,°* Hertz had obtained the following system of equations to 
describe the electromagnetic phenomena in bodies at rest according to electric and 
magnetic polarizations (Ref. 53, p. 211) 


-— =VAE (21) 
and 
-— =-VAH-—IL (22) 


By introducing the polarizations, the electromagnetic energy per unit volume of 
any body takes the form: 


*p.p44M-H. (23) 

87 87 
In these expressions there no longer appear any quantities which refer to 
any particular body. The statement that these equations must be satisfied 
at all points of infinite space, embraces all problems of electromagnetism; 
and the infinite multiplicity of these problems only arises through the fact 
that the constants €, 41, A, E’ of the linear relations, may be functions of 
the space in a multiplicity of ways, varying partly continuously, and partly 
discontinuously, from point to point. (Ref. 53, p. 211) 

As in his first paper, electric and magnetic polarizations must be “regarded as a 
second and equivalent means of indicating the same conditions” (Ref. 54, p. 243) 
as electric and magnetic forces. 

To transpose his fundamental equations for bodies at rest to the case of bodies 
in motion, Hertz remarked that (Ref. 54, pp. 243-244) 


At any point of a body at rest the time-variation of the magnetic state 
is determined simply by the distribution of the electric force in the neigh- 
bourhood of the point. In the case of a body in motion there is, in addition 
to this, a second variation which at every instant is superposed upon the 
first and which arises from the distortion which the neighbourhood of the 
point under consideration experiences through the motion. 


Hertz also introduced a relevant hypothesis according to which “the influence of the 
motion is of such a kind that, if it alone were at work, it would carry the magnetic 
lines of force with the matter” (Ref. 54, p. 244). 


The corresponding statement holds good for the variation which the electric 
polarization experiences through the motion. These statements suffice for 
extending to moving bodies the theory already developed for bodies at rest; 
they clearly satisfy the conditions which our system of itself requires, and 
it will be shown that they embrace all the observed facts. 


Under these considerations, Hertz obtained his fundamental electromagnetic equa- 
tions for bodies in motion (Ref. 54, p. 245): 


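$$\frac{1}{c}\left[\frac{\partial \mathbf{M}}{\partial t} + \nabla \wedge (\mathbf{M} \wedge \mathbf{V}) + \mathbf{V}\,(\nabla \cdot \mathbf{M})\right] = \nabla \wedge \mathbf{E}, \qquad (24)$$
$$\frac{1}{c}\left[\frac{\partial \mathbf{P}}{\partial t} + \nabla \wedge (\mathbf{P} \wedge \mathbf{V}) + \mathbf{V}\,(\nabla \cdot \mathbf{P})\right] = -\nabla \wedge \mathbf{H} - \frac{4\pi}{c}\,\mathbf{I}, \qquad (25)$$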


where V is the velocity with which the surface element was moving. These equations 
“are completed by linear relations which connect the polarizations and the current- 
components with the forces. The constants of these relations are to be regarded as 
functions of the varying conditions of the moving matter, and to this extent as 
functions of the time as well” (Ref. 54, p. 246). Hertz also provided some details 
about the reference frame used for writing these equations: 


Our method of deducing the [previous] equations does not require that the 
system of co-ordinates used should remain absolutely fixed in space. We 
can, therefore, without change of form, transform our equations from the 
system of co-ordinates first chosen to a system of co-ordinates moving in 
any manner through space, by taking $V_x$, $V_y$, $V_z$ to represent the velocity-
components with reference to the new system of co-ordinates, and referring 
the constants $\varepsilon$, $\mu$, $\lambda$, E’, which depend upon direction, at every instant 
to these. From this it follows that the absolute motion of a rigid system of 
bodies has no effect upon any internal electromagnetic processes whatever 
in it, provided that all the bodies under consideration, including the ether 
as well, actually share the motion. It further follows from this consideration 
that even if only a single part of a moving system moves as a rigid body, 
the processes which occur in this part follow exactly the same course as in 
bodies at rest. 


[The previous] equations tell us the future value of the polarizations at every 
fixed point in space or, if we prefer it, in each element of the moving matter, 
as a definite and determinate consequence of the present electromagnetic 
state and the present motion in the neighbourhood of the point under 
consideration (Ref. 54, p. 247). 


Although Hertz asserted that his equations are invariant when passing from a system 
of coordinates at rest to a system of coordinates in motion (with no restriction on 
the type of relative motion), he did not give the coordinate transformation that 
would ensure this invariance, which therefore remained unproved. 

As he did for bodies at rest, Hertz showed that the fundamental electromagnetic 
equations for bodies in motion are consistent with the principle of the conservation 
of energy as well as the principle of action and reaction, as they must be. He then 
concluded (Ref. 54, p. 268): 


I only attach value to the theory of electromagnetic forces in moving bodies 
here proposed from the point of view of systematic arrangement. The 
theory shows how we can treat completely the electromagnetic phenomena 
in moving bodies, under certain restrictions which we arbitrarily impose. 
It is scarcely probable that these restrictions correspond to the actual facts 
of the case. The correct theory should rather distinguish between the con- 
ditions of the ether at every point, and those of the embedded matter. But 
it seems to me that, in order to propound a theory in accordance with this 
view at present, we should require to make more numerous and arbitrary 
hypotheses than those of the theory here set forth. 


Thus, even if the theory developed by Hertz for bodies in motion is consistent 
with the principles of action and reaction and of the conservation of energy, it 
cannot explain certain optical phenomena. This lack of generality was addressed by 
Hendrik A. Lorentz (1853-1928) in 1892 when he investigated the electrodynamics 
of bodies in motion (Ref. 52). 


6.2. Voigt’s wave equation 


The first coordinate transformation leaving invariant the system of electromagnetic 
equations was proposed by Woldemar Voigt (1850-1919). In order to explain the 
so-called Doppler effect, Voigt started to investigate the differential equation 
governing the propagation of a plane wave in an elastic medium, that is, a phenomenon 
equivalent to light propagation in an ether (Ref. 56). Assuming that plane waves were 
moving with a velocity c and a constant amplitude, Voigt was looking for a change 
of coordinates that would leave his wave equation unchanged when he switched from 
one illuminating surface to another one in uniform translation at a velocity V 
with respect to the first one. Voigt had to introduce the coordinate transformation 
(Ref. 56, p. 50) 


$$\begin{aligned} x' &= x - Vt \\ y' &= \gamma y \\ z' &= \gamma z \\ t' &= t - \frac{Vx}{c^2} \end{aligned} \qquad (26)$$

where V (designated by $\kappa$ by Voigt) is the uniform velocity of the moving axes 
$(x', y', z')$ along the x-axis, $\gamma = \sqrt{1 - V^2/c^2}$ ($\gamma$ was designated as q by Voigt) and c 
($\omega$ in Voigt’s notation) is the propagation velocity of the oscillations (or the plane 
waves). Voigt thus needed to introduce a local time $t'$ to obtain the invariance of 
his wave equation when terms in $V^2/c^2$ were neglected (Ref. 56, p. 50): 


[The two illuminating surfaces, one at rest and one in motion at velocity V] 
have identical forms only if $\gamma = 1$, i.e. V is small against c, that $V^2$ can be 
neglected with respect to $c^2$. 
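
The invariance Voigt was after can be checked on a single plane wave. The following minimal sketch assumes the transformation in the form x' = x - Vt, y' = γy, z' = γz, t' = t - Vx/c², with γ = sqrt(1 - V²/c²); the function name and the numerical values are illustrative only and do not come from Voigt's paper. A plane wave with wave vector k' and angular frequency ω' = c|k'| in the primed variables is rewritten in the unprimed ones, and the residual ω² - c²|k|² is evaluated.

```python
import math

def voigt_plane_wave(k_prime, v, c=1.0):
    """Rewrite the phase k'.r' - w' t' in unprimed variables (hypothetical helper).

    Assumes x' = x - v t, y' = g y, z' = g z, t' = t - v x / c**2,
    with g = sqrt(1 - v**2 / c**2), and w' = c |k'|.
    Returns the unprimed wave vector k and angular frequency w.
    """
    g = math.sqrt(1.0 - (v / c) ** 2)
    kx_p, ky_p, kz_p = k_prime
    w_p = c * math.sqrt(kx_p**2 + ky_p**2 + kz_p**2)
    # Collect the coefficients of x, y, z and t in the phase.
    k = (kx_p + w_p * v / c**2, g * ky_p, g * kz_p)
    w = w_p + kx_p * v  # frequency shift between the two descriptions
    return k, w

c = 1.0
k, w = voigt_plane_wave((0.3, -1.2, 0.7), v=0.4, c=c)
residual = w**2 - c**2 * sum(ki**2 for ki in k)
print(residual)  # vanishes up to rounding: w**2 = c**2 |k|**2 still holds
```

The frequency shift w = w' + k'_x V is the Doppler-type effect Voigt set out to describe, while the factor γ only rescales the transverse coordinates.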


While considering a small illuminated sphere of radius r, he thus concluded that 
(Ref. 56, p. 50) 


a stationary observer, since the perpendicular to the wave surface through 
the location of observation gives the direction in which the light source is to 
be perceived, would see the illuminating point at the location where it was 
at time £; in other words, he would observe, if his radius vector r includes 
the angle $\theta$ with the direction of motion, an “aberration” of the size $\frac{V}{c}\sin\theta$ 
in the direction opposite to the motion of the point. 


The aberration could be seen as a length contraction as it was later introduced by 
FitzGerald and Lorentz (see next section). 


6.3. Lorentz’s electrodynamical theory 


When Lorentz started to investigate electrodynamical theories, he was mainly 
focused on the foundations used by Hertz for developing his electromagnetic theories 
for bodies at rest and in motion. Lorentz considered that, to establish his funda- 
mental equations, Hertz “doesn’t take care of a link between the electromagnetics 
actions and the laws of the wave mechanic” (Ref. 52, p. 6). He clarified his approach 
as follows (Ref. 52, p. 6): 


We are always tempted to return to the mechanical explanations. That’s 
why it seemed to me useful to apply directly to the most general case 
the method of which Maxwell gave the example in his study of the linear 
circuits. 

I had another reason to endeavor these researches. In the report where 
Hertz treated bodies in motion, he admits that the ether which they contain 
moves with them. But, optical phenomena demonstrated for a long time 
that it is not there still so. I thus wished to know the laws which govern the 
electric movements in bodies which cross the ether without entraining it, 
and it seemed to me difficult to reach the purpose without having for guide 
a theoretical idea. The sights of Maxwell can be of use for the foundation 
of the theory we are looking for. 


The optical phenomena not considered by Hertz’s assumption were the experiments 
conducted by Fizeau®® (see Appendix A.1) and those by Michelson and Morley®? 
(see Appendix A.2). When he developed his theory, Lorentz had already in mind the 
latter experiment. He thus developed a “theory of electromagnetic phenomena based 
on the idea that there is a ponderable matter perfectly permeable to the ether and 
able to move without transferring to this latter the smallest movement” (Ref. 52, 
p. 70). 

Starting from what we call now the Maxwell equations to describe the motion of 
an electrified body in an ether partly entrained and being the source of an electric 
current and a dielectric phenomenon, Lorentz added a few hypotheses: 


(i) The electrified body and the resting ether are independent, although the ether 
can freely go across the body. 

(ii) The electrified body is considered as a rigid body whose movements are limited 
to a translation and a rotation. 

(iii) The position of any point taking part in the electromagnetic motion is deter- 
mined as soon as one knows the position of all the charged particles of the 
system and the components of the dielectric displacement in the ether at every 
point. 


To develop his theory, Lorentz needed to use two sets of equations, one related to 
the state of the ether and the second related to the reaction of this medium on the 
electrified particles. 

The set of equations related to a system of charged particles moving in the ether 
but without entraining it was (Ref. 52, pp. 89-90) 


$$\nabla \cdot \mathbf{D} = \rho, \qquad (27)$$


where D is the dielectric displacement in the ether and $\rho$ the density of the electric 
charges. We used here the vector notation as introduced by Heaviside®! (Lorentz 
only used the operators V and Q). A similar equation was provided for the magnetic 
force H in the ether, that is 


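$$\nabla \cdot \mathbf{H} = 0. \qquad (28)$$

The last two equations were 

$$\nabla \wedge \mathbf{H} = 4\pi\,\mathbf{I} = 4\pi\left(\rho\,\mathbf{v} + \frac{\partial \mathbf{D}}{\partial t}\right), \qquad (29)$$

where I is the electric current and v the velocity of a point of a charged particle; 

$$-4\pi c^2\,(\nabla \wedge \mathbf{D}) = \frac{\partial \mathbf{H}}{\partial t}, \qquad (30)$$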


where c is the light velocity in the ether. These equations correspond to a slightly 
modified form of Eqs. (12)-(15) proposed by Hertz. Their use was justified by 
Lorentz as follows (Ref. 52, pp. 90-91): 


As far as, in the field considered, there is no charged body, the formula 
given by Hertz in his first memoir are the simplest that one can admit to 
express the state of the ether. 


For Lorentz, light propagation consists in (Ref. 52, p. 112) 


oscillatory movements that the charged particles could perform in the 
molecules of a dielectric. Accompanied with periodic changes in the state of 
the ether, these vibrations will constitute a beam of light, the propagation 
of which I suggest to study. 


The previously established equations “will serve to determine the state of the ether 
which is compatible with the movement of particles” (Ref. 52, p. 112). 

To establish the fundamental equations describing the propagation of light in 
bodies in motion, Lorentz needed to introduce two different sets of coordinates. 
Supposing that (Ref. 52, p. 136) 


all molecules of the dielectric are animated with the same velocity of trans- 
lation parallel to the x-axis and independent of the time. I will designate 
by V this velocity and, keeping for now motionless axes $O_x$, $O_y$ and $O_z$, I 
will introduce new axes which are fixedly linked to the ponderable matter. 
The first of these axes will match with $O_x$; the two others will be parallel 
to $O_y$ and $O_z$ and will match with these axes at time t = 0. 


The coordinate transformation from a set of motionless axes to a set of axes in 
uniform translation at a velocity of V is then given by (Ref. 52, p. 136): 

$$x' = x - Vt, \qquad y' = y, \qquad z' = z, \qquad (31)$$

where $(x', y', z')$ are the coordinates in uniform translation and $(x, y, z)$ are the 
coordinates at rest. This transformation is completed by the partial derivatives 
expressed as (Ref. 52, p. 136) 


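$$\nabla = \nabla', \qquad \frac{\partial}{\partial t} = \frac{\partial}{\partial t'} - V\frac{\partial}{\partial x'}, \qquad (32)$$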
allowing Lorentz to rewrite his fundamental equations in the form (Ref. 52, p. 137) 
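$$\nabla \cdot \mathbf{H} = 0, \qquad (34)$$
$$\nabla \wedge \mathbf{H} = 4\pi\left[\rho\,(\mathbf{v} + \mathbf{V}) + \left(\frac{\partial}{\partial t} - \mathbf{V}\cdot\nabla\right)\mathbf{D}\right], \qquad (35)$$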


where $\mathbf{V} = (V, 0, 0)$ and v “does no longer designate the absolute velocity of a charged 
particle, but is the velocity with respect to the ponderable matter, so that the absolute 
velocity becomes $(v_x + V, v_y, v_z)$” (Ref. 52, p. 137). The last equation is 


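$$-4\pi c^2\,\nabla \wedge \mathbf{D} = \left(\frac{\partial}{\partial t} - \mathbf{V}\cdot\nabla\right)\mathbf{H}. \qquad (36)$$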


As long as the charged particles have no other motion than the common 
velocity of the ponderable matter, we will have v = 0 and the density at 
a point x will be independent of t. This will be no longer like this when 
molecules are the place of electric vibrations. (Ref. 52, p. 137) 


With his new electromagnetic theory of light, Lorentz sought to explain the 
negative result provided by Michelson and Morley’s experiment (see Appendix A.2), 
trusting Fresnel’s theory for a partly driven ether. Lorentz computed, using the 
classical composition law for velocities and Fresnel’s partial driving, that for this 
experiment “the time required by light to travel forth and back between [the] two 
points regarded as fixed to Earth” (Ref. 60, p. 1) should give a fringe shift of order 
$lV^2/c^2$, where l is the distance between the two points, V is the velocity of the Earth 
and c is the light velocity. Unfortunately, Michelson and Morley’s experiment did not 
show any displacement of the fringes. Lorentz finally reached, independently, the 
same conclusion as FitzGerald (Ref. 60, p. 2) (see Appendix A.2): 


I found only one way to reconcile [Michelson’s] result with Fresnel’s theory. 
It consists in the assumption, that the line joining two points of a solid 
body does not conserve its length, when it is once in motion parallel to the 
direction of the Earth motion, and afterwards it is brought normal to it. 


Lorentz quantified this contraction according to the coefficient 


$$\alpha = \frac{V^2}{2c^2}. \qquad (37)$$

He physically justified such a contraction as follows (Ref. 60, p. 2): 


What determines the size and shape of a solid body? Apparently the inten- 
sity of molecular forces; any cause that could modify it, could modify the 
shape and size as well. Now we can assume at present, that electric and 
magnetic forces act by intervention of the aether. It is not unnatural to 
assume the same for molecular forces, but then it can make a difference, 
whether the connecting line of two particles, which move together through 
the ether, is moving parallel to the direction of motion or perpendicular 
to it. One can easily see, that an effect of order $V/c$ is not expected, but an 
effect of order $V^2/c^2$ is not excluded and that is exactly what we need. 
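
The arithmetic behind this hypothesis can be sketched numerically. The snippet below is only an illustration under stated assumptions: the classical ether-theory round-trip times for an arm parallel and an arm perpendicular to the motion, an arm length of 11 m and an Earth velocity of 30 km/s (both chosen for illustration, not taken from Lorentz's paper), and a contraction of the parallel arm by the second-order factor 1 - V²/(2c²).

```python
import math

C = 299_792_458.0   # light velocity in m/s
V_EARTH = 3.0e4     # illustrative orbital velocity of the Earth in m/s

def round_trip_times(length_parallel, length_perpendicular, v, c=C):
    """Classical ether-theory round-trip times for the two arms (hypothetical helper)."""
    beta2 = (v / c) ** 2
    t_parallel = 2.0 * length_parallel / (c * (1.0 - beta2))
    t_perpendicular = 2.0 * length_perpendicular / (c * math.sqrt(1.0 - beta2))
    return t_parallel, t_perpendicular

arm = 11.0  # illustrative arm length in metres

# Without contraction: the two times differ by about arm * V**2 / c**3.
t_par, t_perp = round_trip_times(arm, arm, V_EARTH)
delta_no_contraction = t_par - t_perp

# With the parallel arm shortened by the factor 1 - V**2 / (2 c**2).
alpha = V_EARTH**2 / (2.0 * C**2)
t_par_c, t_perp_c = round_trip_times(arm * (1.0 - alpha), arm, V_EARTH)
delta_with_contraction = t_par_c - t_perp_c

print(delta_no_contraction, arm * V_EARTH**2 / C**3)
print(delta_with_contraction)  # negligible compared with the first difference
```

The second difference is not exactly zero, since the contraction factor is only correct to second order in V/c, but what remains is of fourth order.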


In 1895, he improved his 1892 theory. Keeping the two different sets of coordi- 
nates introduced in his first theory — one set of fixed coordinates and one moving set 
of coordinates “rigidly connected with ponderable matter and therefore its displace- 
ment” (Ref. 61, Sec. 19) — he then proposed the new coordinate transformation 
(Ref. 61, Sec. 23) 


$$x' = kx, \qquad y' = y, \qquad z' = z, \qquad (38)$$

where $k = \frac{1}{\sqrt{1-\beta^2}}$, with $\beta = \frac{V}{c}$ as the contraction coefficient. The location of a 
point with respect to the fixed system was designated by $(x, y, z)$ and the location 
of that point in the relative coordinates was designated by $(x', y', z')$. Lorentz was 
thus able to show that the new electric density was (Ref. 61, Sec. 23) 


$$\rho' = \rho\sqrt{1-\beta^2}. \qquad (39)$$


When he tried to express the magnetic force H in the resting frame, the resulting 
expression suggested to him the introduction of a new time (Ref. 61, Sec. 31) 


$$t' = t - \frac{V_x x + V_y y + V_z z}{c^2}, \qquad (40)$$


which can be regarded as a “local time” (Ref. 61, Sec. 31), in contrast to the “general 
time”. Lorentz was looking for an invariance of the electric force E, the magnetic 
force H and the dielectric polarization P, that he formulated as follows (Ref. 61, 
Sec. 59): 


Namely, if a state of motion for a system of stationary bodies is known, 
where P, E, H are certain functions of x, y, z and t, then in the same 
system, if it is displaced by the velocity V, there can exist a state of motion, 
where P’, E’, H’ are exactly the same functions of x’, y’, z’ and t’, 


where t’ is the local time given by Eq. (40). 


Using the coordinate transformations (38)—(40), Lorentz was able to conclude 
that (Ref. 61, Sec. 64) 


in general, the motion of Earth will never have an influence of the first 
order on experiments with terrestrial light sources. 
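
Why a local time removes first-order effects can be illustrated with a short numerical sketch. It assumes, for illustration only, relative coordinates x' = x - Vt, y' = y and a local time t' = t - Vx/c² for a motion along the x-axis, and follows a light signal emitted from the origin of the ether frame in a direction making an angle θ with the motion. The function name is hypothetical.

```python
import math

def apparent_light_speed(theta, beta, c=1.0):
    """Speed of a light signal expressed in relative coordinates and local time.

    Assumptions (illustrative): x' = x - V t, y' = y, t' = t - V x / c**2,
    with V = beta * c along the x-axis; the signal leaves the origin at t = 0.
    """
    v = beta * c
    t = 1.0  # any positive ether-frame time
    x, y = c * t * math.cos(theta), c * t * math.sin(theta)
    x_p, y_p = x - v * t, y
    t_p = t - v * x / c**2
    return math.hypot(x_p, y_p) / t_p

beta = 1.0e-4  # roughly the Earth's orbital V/c
angles = [i * math.pi / 6 for i in range(12)]
deviations = [abs(apparent_light_speed(a, beta) - 1.0) for a in angles]
print(max(deviations), beta**2)  # deviations stay of order beta**2
```

In these variables the apparent light velocity differs from c only by terms of order β², which is the sense in which no first-order influence of the motion survives.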


In 1899, Lorentz modified the coordinate transformations (38) and (40) into 
(Ref. 62, p. 429) 


$$t' = t - \frac{1}{c}\,\frac{\beta x}{1-\beta^2}, \qquad x' = \frac{x}{\sqrt{1-\beta^2}}, \qquad y' = y, \qquad z' = z. \qquad (41)$$


But this was not yet satisfactory and Lorentz obtained the invariance of the fields 
only when he neglected the terms depending on $(V/c)^2 = \beta^2$. The imperfection of 
Lorentz’s results was pointed out by Poincaré when he wrote for the celebration 
of the 25th anniversary of Lorentz’s Ph.D. thesis.° 

Stimulated by Poincaré’s criticisms, Lorentz developed a third theory for 
attempting to overcome the difficulties so pointed out (Ref. 64, p. 18): 


Poincaré has objected to the existing theory of electric and optical phenom- 
ena in moving bodies that, in order to explain Michelson’s negative result, 
the introduction of a new hypothesis has been required, and that the same 
necessity may occur each time new facts will be brought to light. Surely, 
this course of inventing special hypotheses for each new experimental result 
is somewhat artificial. It would be more satisfactory if it were possible to 
show, by means of certain fundamental assumptions, and without neglect- 
ing terms of one order of magnitude or another, that many electromagnetic 
actions are entirely independent of the motion of the system. Some years 
ago, I have already sought to frame a theory of this kind (Ref. 62, p. 507). 
I believe now to be able to treat the subject with a better result. The only 
restriction as regards the velocity will be that it be less than that of light. 


Lorentz therefore proposed an additional correction to the change of coordinates 
(41) allowing for switching from a resting frame to a frame moving with a constant 
velocity V along the x-axis; he thus proposed (Ref. 64, p. 812): 


$$t' = \frac{l}{k}\,t - kl\,\frac{V}{c^2}\,x, \qquad x' = klx, \qquad y' = ly, \qquad z' = lz, \qquad (42)$$

where $t'$ is the local time in the moving frame, l is a numerical quantity to determine, 
$k = \frac{1}{\sqrt{1-\beta^2}}$ and $\beta = \frac{V}{c}$ is the contraction coefficient along the direction of the 
motion. With this new transformation, the electric density became (Ref. 64, p. 813) 


$$\rho' = \frac{1}{kl^3}\,\rho. \qquad (43)$$


Applying this new transformation to the electrons, Lorentz supposed that (Ref. 64, 
p. 818) 


the electrons, which I take to be spheres of radius R in the state of rest, 
have their dimensions changed by the effect of a translation, the dimensions 
in the direction of motion becoming kl times and thus in perpendicular 
directions l times smaller. In this deformation, which may be represented 
by $\left(\frac{1}{kl}, \frac{1}{l}, \frac{1}{l}\right)$, each element of volume is understood to preserve its charge. 


Concerning the value of l, the single condition which must be taken into account is 
(Ref. 64, p. 823) 

$$\frac{d(klV)}{dV} = k^3 l. \qquad (44)$$

Since 

$$\frac{d(kV)}{dV} = k^3, \qquad (45)$$

this condition can be rewritten as 

$$\frac{dl}{dV} = 0 \qquad (46)$$

and, consequently, l must be a constant. Lorentz then concluded that “the value of 
the constant must be unity, because we know already that, for V = 0, l = 1” (Ref. 64, 
p. 824). Nevertheless, the coordinate transformation (42) was not yet fully correct, 
as pointed out by Poincaré, who sent Lorentz the correct form in two letters (May 
1905, see Refs. 65 and 66) 


$$t' = kl\,(t + \varepsilon x), \qquad x' = kl\,(x + \varepsilon t), \qquad y' = ly, \qquad z' = lz, \qquad (47)$$


where $-\varepsilon$ is the velocity of the translation, the light velocity being taken equal to 
unity. As we will see in Sec. 7, Poincaré proposed a more rigorous way to show 
that l = 1. 
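
A short numerical sketch may help to see what is special about form (47). Taking l = 1 and the light velocity equal to unity, as in the text, and writing k = 1/sqrt(1 - ε²), the snippet checks on one arbitrary event that the quantity x² + y² + z² - t² is left unchanged, and that two successive transformations with parameters ε1 and ε2 amount to a single one with parameter (ε1 + ε2)/(1 + ε1 ε2). The function names and the test values are illustrative assumptions, not taken from the sources.

```python
import math

def poincare_transform(event, eps, l=1.0):
    """Apply t' = k l (t + eps x), x' = k l (x + eps t), y' = l y, z' = l z."""
    t, x, y, z = event
    k = 1.0 / math.sqrt(1.0 - eps**2)
    return (k * l * (t + eps * x), k * l * (x + eps * t), l * y, l * z)

def interval(event):
    """Quadratic form x**2 + y**2 + z**2 - t**2 (light velocity set to 1)."""
    t, x, y, z = event
    return x**2 + y**2 + z**2 - t**2

event = (2.0, 0.5, -1.0, 3.0)  # arbitrary (t, x, y, z)
eps1, eps2 = 0.3, 0.5

once = poincare_transform(event, eps1)
twice = poincare_transform(once, eps2)
combined = poincare_transform(event, (eps1 + eps2) / (1.0 + eps1 * eps2))

print(interval(event), interval(once))  # equal up to rounding
print(twice, combined)                  # equal up to rounding
```

With l different from 1 the quadratic form is multiplied by l², so the exact invariance checked here is tied to the choice l = 1.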

The second theory of Lorentz is based on a few fundamental hypotheses; among 
them we have: 


• There is a contraction of dimensions in the direction of motion by a coefficient 
kl (when l = 1, this coefficient is in agreement with Kaufmann’s experimental 
results). 
• The forces observed between uncharged particles as well as between electrified 
particles are influenced by a translation in quite the same way as the electric 
force in an electrostatic system. 


One of the very important conclusions by Lorentz is that (Ref. 64): 


It will be impossible to detect an influence of the Earth’s motion in any 
optical experiments, made with a terrestrial source of light [...]. Many 
experiments on interference and diffraction belong to this class. 


The Michelson and Morley experiment was thus explained (at a given order) with 
this theory. Lorentz was not able to prove at any order the invariance of Maxwell’s 
equations. Nevertheless, he was also able to express the function $\psi(\beta)$ introduced by 
Max Abraham (1875-1922) who proposed an equation expressing the dependence 
of the mass m on the velocity V,®" that is, 


$$\frac{e}{m} = \frac{e}{m_0}\,\frac{4}{3}\,\frac{1}{\psi(\beta)}, \qquad \text{where } \beta = \frac{V}{c}, \qquad (48)$$

and $m_0$ is the mass at rest. At that time, the electric charge e of the electron was 
not yet determined with a great accuracy but, according to Stoney,®® it was mostly 
considered as being constant. This equation was experimentally tested by Walter 
Kaufmann (1871-1947).°° Lorentz proposed to use 


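$$\psi(\beta) = \frac{4}{3}\,\frac{1}{\sqrt{1-\beta^2}}. \qquad (49)$$

When introduced in Eq. (48), one gets 

$$\frac{e}{m} = \frac{e}{m_0}\,\frac{4}{3}\,\frac{1}{\frac{4}{3}\frac{1}{\sqrt{1-\beta^2}}} = \frac{e}{m_0}\sqrt{1-\beta^2}, \qquad (50)$$

that is, 

$$m = \frac{m_0}{\sqrt{1-\beta^2}} \qquad (51)$$
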
when the electric charge e is kept constant. Although he stated that the non- 
electromagnetic forces should be affected in the same manner as the electromagnetic 
forces, it was not clear for Lorentz whether the mass, in the “classical” sense, would 
vary as observed in Kaufmann’s experiments. For this reason, Lorentz introduced 
the concept of “electromagnetic mass”, with “longitudinal” and “transverse” 
components. He thus supposed that “the masses of all particles are influenced by 
a translation to the same degree as the electromagnetic masses of the electrons”. 
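
The step from Eq. (48) to Eq. (51) is plain arithmetic and can be checked numerically. The sketch below assumes Lorentz's choice in the form ψ(β) = (4/3)/sqrt(1 - β²) and a constant charge e; the function names are hypothetical and the units are arbitrary (e = m0 = 1).

```python
import math

def psi_lorentz(beta):
    """Lorentz's choice for psi (assumed form: (4/3) / sqrt(1 - beta**2))."""
    return (4.0 / 3.0) / math.sqrt(1.0 - beta**2)

def charge_to_mass(beta, e=1.0, m0=1.0):
    """e/m = (e/m0) * (4/3) / psi(beta), the relation quoted as Eq. (48)."""
    return (e / m0) * (4.0 / 3.0) / psi_lorentz(beta)

for beta in (0.0, 0.5, 0.9):
    mass_ratio = 1.0 / charge_to_mass(beta)  # m/m0 when e is kept constant
    print(beta, mass_ratio, 1.0 / math.sqrt(1.0 - beta**2))
```

The two printed ratios coincide for each β, which is the content of Eq. (51) under the stated assumptions.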


Lorentz was not so conclusive because: 


What we know about the nature of electrons is very little and the only 
means of pushing our way farther will be to test such hypotheses as I have 
here made. 


6.4. Larmor’s theory 


Another theory was proposed by Joseph Larmor (1857-1942). It was based on Neu- 
mann’s theory according to which the velocity of the ether is provided in amplitude 
and direction by the magnetic force rather than the electric force as in Fresnel’s the- 
ory.® Investigating the propagation of a radiation in a material medium — the free 
ether — moving by a velocity V along the x-axis, “whose dynamical equations have 
been definitely ascertained in quite independent ways from consideration of both the 
optical side and the electrodynamic side of its activity” (Ref. 70, p. 161), Larmor 
introduced in 1897 a change of coordinates (Ref. 69, p. 299) that he modified three 
years later into (Ref. 70, p. 174) 


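$$t' = t\sqrt{1-\frac{V^2}{c^2}} - \frac{Vx}{c^2\sqrt{1-\frac{V^2}{c^2}}}, \qquad x' = \frac{x}{\sqrt{1-\frac{V^2}{c^2}}}, \qquad y' = y, \qquad z' = z, \qquad (52)$$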


where $t'$ is a local time. He used an assumption for length contraction, as FitzGerald 
and Lorentz did. His change of coordinates was close to what Lorentz introduced 
in 1899 (therefore two years after Larmor). 

Larmor thus obtained (Ref. 70, p. 176) 


the result, correct to the second order, that if the internal forces of a mate- 
rial system arise wholly from electrodynamic actions between the systems 
of electrons which constitute the atoms, then an effect of imparting to 
a steady material system a uniform velocity of translation is to produce a 
uniform contraction of the system in the direction of the motion, of amount 


(53) 


We have seen a few variants of the electrodynamical theory. None of them had clearly 
reached a preferential status, each of them having its own particularities that made 
it as probable as the others. Only a rigorous mathematical and physical 
analysis, using a uniform notation (as we did in this paper) to make comparisons 
between them possible, could allow one to discriminate between them. Such an analysis 
was performed by Poincaré who wrote (Ref. 71, Introduction) 


Although none of these theories seems to me fully satisfactory, each one 
contains without any doubt a part of the truth and comparing them may 
be instructive. From all of them, Lorentz theory seems to me the one which 
describes in the better way the facts. 


7. Poincaré’s Contribution 


Henri Poincaré started to work on light and optical phenomena when he got the 
chair previously occupied by Gabriel Lippmann at the Faculté des Sciences de Paris 
in 1886. In his lectures, taught at La Sorbonne (Paris) in 1887-1888 (see Ref. 72) 
and published three years later, Poincaré made an overview of all the existing optical 
theories with a rigorous and quite extensive mathematical analysis. His main task 
was to determine which theory was explaining the largest number of phenomena. 
For Poincaré, a scientific law must be general to be of interest and for one to be 
confident enough in it. This was justified as follows (Ref. 73, p. 306): 


These principles result from highly generalized experiments; but they seem 
to borrow from this generality a huge degree of certainty. The more gen- 
eral they are, the more often they can be checked, and these verifications, 
in becoming more numerous, in taking the most various and unexpected 
forms, lead by not leaving any doubt. 


But Poincaré did not forget his philosophy and thus added (Ref. 72, p. I) 


Mathematical theories do not aim to reveal the true nature of things; this 
could be an unreasonable pretension. Their unique aim is to coordinate 
physical laws that experience taught to us, but that, without the help of 
mathematics, we could not even state. 


To complete Poincaré’s views on the status of theories, it should be stated that for 
him, “any generalization is an hypothesis”: From that point of view, a theory is 
nothing else than a possible explanation. It cannot be the truth, not even a partial 
truth. Such an approach opened the way followed by Karl Popper (1902-1994) who 
pushed the idea that a theory is the expression of the most advanced knowledge at 
a given time.” Poincaré also wrote that” 


experiment [...] alone can teach us something new; it alone can give us 
certainty. It is not sufficient merely to observe: We must use our obser- 
vations, and for that purpose we must generalize. Mathematical physics 
[...] must direct the generalization, so as to increase [...] the output of 
Science. 


It is therefore understood why Poincaré always used the conditional form when he 
presented his results; he only considered them as a possible representation of the 
phenomena under study. 

In all pre-existing theories, an ether was required, whose undulations propagate 
the light and/or the electromagnetic field. However, Poincaré, 
in accordance with his philosophical approach to theories, stated that (Ref. 72, p. I) 


it does not matter that the ether actually exists; this is metaphysicians’ 
business; what is relevant for us is that it is as if it would exist and that 
this assumption is convenient to explain phenomena. After all, do we have 
another reason to believe in the existence of material objects? This is only 
a convenient assumption; and it will be always like that, while someday 
will come for sure where the ether will be rejected, being useless. But this 
day, laws for optics and equations which analytically described them will 
remain true, at least as a first approximation. It will be always useful to 
investigate a doctrine linking all together these equations. 


From his mathematical point of view, introducing an ether is reduced to a 
simplification of the mathematical description of the electromagnetic phenomena 
(Ref. 75, p. 1171) 


Does our ether actually exist? One knows from where is coming the belief 
in the ether? If the light arises from a distant star, during many years, it is 
no longer on the star and it is not yet on the Earth; it is necessary that it 
is somewhere and, supported, so to speak, by some material support. One 
may express the same idea under a more mathematical and a more abstract 
form. What we observe, there are the changes affecting material molecules; 
[...]. In the classical mechanics, the state of the system under study only 
depends on its state at an immediately preceding time; the system thus 
obey to some differential equations. Contrary to this, if we do not believe 
in the ether, the state of the material universe would depend not only on 
the immediately preceding state, but also on much older states; the system 
would obey to some finite-difference equations. This is for escaping to this 
exemption to the general laws of mechanics that we invented the ether. 


Poincaré reminded us that the ether was supposed to fill transparent material 
media as well as the interplanetary space, just because (Ref. 72, p. 379) 


it is not possible to conceive light propagation from the Sun to the Earth 
without the existence of an elastic medium. Contrary to this, it can be 
unnecessary and perhaps philosophical to assume the existence of an ether 
in material media. Nevertheless, the phenomenon of astronomical aberra- 
tion that evidences the relative motion of the ether and the ponderable 
medium that it penetrates seems to be absolutely opposed to the removal 
of this hypothesis; or at least, if this hypothesis is rejected, the explanation 
of the astronomical aberration would present so many difficulties that its 
maintenance is desirable. 


This comment mainly results from some experiments which were performed with 
an astronomical telescope filled with air or with water: No difference was found. 
This is therefore an experiment more or less similar to that of Fizeau which is here 
discussed by Poincaré under the name aberration. 

Thus, in order to compare the main optical theories, Poincaré had, on the one
hand, to propose a rigorous theory of the ether, which was part of all existing theories,
and, on the other hand, to unify their mathematical expressions. To investigate
light propagation according to the wave theory, which was then the most complete,
Poincaré based his treatment of light waves on a molecular hypothesis. He
thus devoted a complete study to the small movements of the distinct molecules
constituting the elastic medium, which he therefore treated as discontinuous matter. He
reproduced the most widely accepted description of what the ether should be, that is,
“formed of molecules distant one from the other” (Ref. 72, p. 1), which are governed
by some forces pushing them out of their equilibrium states and which, left to themselves,
“oscillate by very small motions around their equilibrium state” (Ref. 72, p. 1). The
medium is considered as isotropic, meaning that any plane is a plane of symmetry
and that the equations are left invariant when two or three axes are permuted or
when x is replaced with −x. In other words, the equations do not depend on a
particular choice of axes (Ref. 72, p. 23):


There exists certain functions of the coefficients of these equations known 
under the name of invariants, which are not dependent on the choice for 
the axes; they must be isotropic functions. One of the invariants is the sum 
of the squared coefficients of the squared terms. 


Poincaré retained Fresnel’s results for wave propagation according to which “exper- 
iments show that vibrations of ether are always transverse” (Ref. 72, p. 53). He 
specified that (Ref. 72, p. 65) 


in experimental studies in optics, it is not possible to directly determine the 
direction for the vibrations of the ether propagating rectilinearly polarized 
light; what one might observe is that phenomena depend on the position 
of a certain plane, the so-called plane of polarization. By symmetry prop- 
erties, the direction of vibrations must be either in the polarization plane 
or perpendicular to this plane. Fresnel admits that it is perpendicular to 
it, other scientists preferred the opposite hypothesis. 


He explained that the quantities occurring in the equations describing the transverse 
movements “can be considered as constant with respect to the duration of a certain 
number of vibrations” (Ref. 72, p. 66), the velocity c of light propagation being 
considered as an absolute value in a homogeneous medium. 

Poincaré used these mathematical elements characterizing the ether to establish
a detailed mathematical analysis of various optical phenomena commonly considered
by existing optical theories: reflection, refraction, diffraction, dispersion,
double refraction, aberration, etc. Poincaré chose to investigate aberration by using
a wave theory, the ether being considered as the support of light propagation
(Ref. 72, p. 379):


The phenomenon of astronomical aberration, which evidences the relative 
motion between the ether and the ponderable medium that it penetrates, 
seems to argue against the removal of this assumption; or at least, if this 
hypothesis was rejected, the explanation of the astronomical aberration 
would encounter such difficulties that its upholding is preferable. 


Taking into account Fresnel’s results as well as Michelson and Morley’s experiment, 
both related to an ether partly driven, Poincaré applied the composition law for 
velocities for evaluating the light velocity c with respect to motionless axes in space, 
that is (Ref. 72, p. 387), 


$$c = c' + V\left(1 - \frac{1}{n^2}\right)\cos\varphi, \qquad (54)$$


where $c'$ would be the absolute value of light velocity, $V\left(1 - \frac{1}{n^2}\right)$ the absolute driving
velocity of the ether and $\varphi$ the angle between these two velocity vectors.
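
As a rough numerical illustration of the composition law in Eq. (54), the following Python sketch evaluates the partial-dragging term for water. It assumes that the dragging factor is Fresnel's coefficient 1 − 1/n², with n the refractive index, and that c' is the light velocity in the medium at rest (vacuum velocity divided by n). The function name, the refractive index of 1.333 and the flow speed of 7 m/s are illustrative choices, not values taken from the text.

```python
import math

C_VACUUM = 299_792_458.0  # speed of light in vacuum, m/s (modern defined value)


def dragged_light_speed(c_medium, v_medium, n, phi):
    """Light speed relative to axes at rest, following the form of Eq. (54).

    c_medium : light speed in the medium at rest (assumed to be C_VACUUM / n)
    v_medium : speed of the moving medium
    n        : refractive index (assumption: drag factor is 1 - 1/n**2)
    phi      : angle between the light ray and the medium velocity, in radians
    """
    drag_coefficient = 1.0 - 1.0 / n**2
    return c_medium + v_medium * drag_coefficient * math.cos(phi)


# Illustrative values (hypothetical, not from the text): water flowing at 7 m/s.
n_water = 1.333
c_water = C_VACUUM / n_water
v_flow = 7.0

gain_parallel = dragged_light_speed(c_water, v_flow, n_water, 0.0) - c_water
gain_perpendicular = dragged_light_speed(c_water, v_flow, n_water, math.pi / 2) - c_water

print(f"drag coefficient: {1.0 - 1.0 / n_water**2:.4f}")
print(f"speed gain along the flow: {gain_parallel:.3f} m/s")
print(f"speed gain across the flow: {gain_perpendicular:.3e} m/s")
```

Under these assumptions, less than half of the flow speed is communicated to the light, and nothing at all when the ray is perpendicular to the flow, which is the sense of a "partly driven" ether.
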

When the moving axes are permanently linked to the medium in motion, the 
velocity c of the light becomes (Ref. 72, p. 388) 


$$c = c' - V' \cos\varphi, \qquad (55)$$


where $V'$ is the relative driving velocity of the ether. It is this last expression that
Poincaré used to express the duration spent by light to travel the distance between
a point $A_0$ and a point $A_n$ in a moving medium, that is (Ref. 72, p. 389),


Aids . V 
s = aa eho Ans (56) 


where c’ is the absolute light velocity, V is the displacement velocity of the medium, 
c= © and AjAy, is the projection of AgA,, on the x-axis. 


The first term 5° —— represents the duration that light would spend when 
the medium is at rest, the second only depends on the location of the 
extreme points and by no means of the path traveled by light for going 
from one point to the other (Ref. 72, p. 389). 

An important consequence of the preceding formula is that reflection
and refraction laws, interference phenomena are not affected by the Earth
motion. (Ref. 72, p. 389)


If one forgets to look for a mechanical explanation of light propagation and if
one does not consider valid the assumption that light emitted by stars depends
on their size, as supported by Michell and Arago, one has no difficulty in
describing the astronomical aberration as Bradley did. The whole problem arises when
one wants to provide an explanation of light propagation (we clearly distinguish
“describing” from “explaining”). Poincaré and his predecessors considered
relative motion with the composition law for velocities implicitly in mind, but
never clearly addressed the problem in terms of the distance traveled by light. This
is evident from the statement of Poincaré that (Ref. 72, p. 391)


optical phenomena only evidence relative motions with respect to the obser- 
vations of the light source and of ponderable matter. This is what happens 
in aberration where the observer and the observed star are not animated 
with the same motion; this is what happens in Fizeau’s experiments where 
the water contained in the tubes has a relative motion with respect to the 
observer. A single fact would not correspond to these conclusions, this is
the variation of the polarization plane of reflected light.


But Poincaré, like his predecessors, was deeply concerned with “explanation”, and was
thus forced to remind us of the weak status of any theory (Ref. 72, p. 398):


It is quite difficult to explain these aberration phenomena, and there is no
satisfactory theory. There is no sufficient reason to choose a theory [...].
Indeed, we cannot complain to be in the impossibility to make a choice. This
impossibility shows us that mathematical theories for physical phenomena
must be only considered as research tools; very precious tools, this is true,
but from which we must not remain as slaves and that we must reject as
soon as they are in an actual contradiction with experiments.


The second monograph resulting from Poincaré’s lectures is mostly devoted 
to Maxwell’s electromagnetic theory. For Poincaré, Maxwell’s theory is more a 
theory for electromagnetic phenomena than for light propagation and therefore 
deserved a specific monograph. According to Poincaré, Maxwell’s aim was (Ref. 76,
p. 192)


to find an explanation for electric and electromagnetic phenomena, commonly
attributed to a force acting at a distance, by means of a hypothetical
fluid filling the space.


Poincaré summed up Maxwell’s approach as follows (Ref. 76, pp. XIV-XV):


In order to demonstrate the possibility for a mechanical explanation of
electricity, we do not have to worry for finding an explanation in itself, it
is sufficient for us to know the expressions of the two functions T and U
[describing the kinetic energy and the potential energy, respectively] which
are the two parts of the energy, to construct with these two functions
Lagrange’s equations, and to compare these equations with experimental
laws.


He was clearly motivated to investigate Maxwell’s theory by its unifying character:
it brings together, on the one hand, the ether which propagates light according to
Fresnel and, on the other hand, Maxwell’s fluid used for explaining electromagnetic
phenomena. Both fluids have the same properties in Maxwell’s works. Such a correspondence
was already pointed out by Maxwell (Ref. 43, Vol. 1, p. 431), as quoted
by Poincaré himself:


To fill all the space with a new medium whenever any new phenomenon 
is to be explained is by no means philosophical, but if the study of two 
different branches of science has independently suggested the idea of a 
medium, and if the properties which must be attributed to the medium in 
order to account for electromagnetic phenomena are of the same kind as 
those that we attribute to the luminiferous medium in order to account 
for the phenomena of light, the evidence for the physical existence of the 
medium will be considerably strengthened. 


Poincaré then wrote (Ref. 76, pp. 193-194): 


The ether and Maxwell’s fluid having the same properties, light must be 
considered as an electromagnetic phenomenon and the vibratory move- 
ment which produces on our retina the impression of a light intensity must 
result from periodic perturbations of a magnetic field. If this is so, from 
general equations for this field must be deduced the explanation of light 
phenomena. 


Poincaré regarded each experiment in which the optical and electrical
constants of a given body were found to have nearly (if not exactly) equal values as a new
“indirect but convincing” (Ref. 76, p. 194) validation of the electromagnetic theory
of light. He considered one of the best validations to be the agreement between the values
for the velocity of light propagation found by Foucault, Fizeau and Cornu (for instance, Alfred Cornu
(1842-1902) found $3.0004 \times 10^{8}$ m·s⁻¹ in 1876 (see Ref. 77)) and the value deduced
from the electromagnetic theory. In fact, Poincaré remarked that the equations
governing the propagation of an electromagnetic perturbation were similar to those 
governing the movements of an ether molecule, a similarity that he considered as a 
“confirmation of the assumption concerning the electromagnetic nature of luminous 
vibrations” (Ref. 76, p. 197). Moreover, these former equations lead to transverse
periodic electromagnetic perturbations (as Maxwell showed) propagating with the
velocity

$$c = \frac{1}{\sqrt{K\mu}}, \qquad (57)$$

which reduces to $c = 1/\sqrt{K}$ in the vacuum since $\mu$ is the permeability coefficient, equal
to 1 in the electromagnetic system of units, and $K$ is the permittivity. He found, as
Maxwell did, that c is also the quantity of electricity (in the electromagnetic system of
units) in one unit of electromagnetism.
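
The comparison Poincaré relied on can be repeated with modern numbers. The sketch below evaluates the propagation velocity of Eq. (57) in vacuum using SI constants rather than the electromagnetic system of units used in the text. This substitution is an assumption made for convenience, as are the function name and the constant values written in the code. The result is then compared with the value attributed above to Cornu.

```python
import math

# Assumption: SI vacuum constants stand in for K and mu of Eq. (57);
# the text itself works in the electromagnetic system of units.
EPSILON_0 = 8.8541878128e-12   # vacuum permittivity, F/m
MU_0 = 1.25663706212e-6        # vacuum permeability, H/m


def wave_speed(permittivity, permeability):
    """Propagation speed 1/sqrt(K * mu) of a transverse electromagnetic wave."""
    return 1.0 / math.sqrt(permittivity * permeability)


c_theory = wave_speed(EPSILON_0, MU_0)
c_cornu = 3.0004e8  # m/s, the 1876 value quoted in the text for Cornu

relative_gap = abs(c_theory - c_cornu) / c_theory
print(f"1/sqrt(K*mu) = {c_theory:.6e} m/s")
print(f"relative gap to the quoted optical value = {relative_gap:.1e}")
```

With these inputs the electromagnetic and optical values differ by less than one part in a thousand, which is the kind of agreement the text describes as a validation.
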

His study of Hertz’s experiments, in which the velocity of electromagnetic waves
was found to be of the same order of magnitude as the velocity of light, led him to conclude
that “this is again a very satisfying validation of the electromagnetic theory of
light, if one takes into account the difficulties in measuring the quantities involved
in Hertz’s computations” (Ref. 76, p. 203). Double refraction, which is one
of the most complex optical phenomena to explain, was fully explained using the
electromagnetic theory of light, as Poincaré showed, lending further support to this
theory.

After the optical theories, Poincaré turned his attention to the electrodynamic
theories, leading to a new monograph published in 1891.° In his lectures, he reviewed
the main electrodynamic theories of the time, namely those of Ampère,
Weber, Helmholtz and Maxwell. He paid particular attention to Helmholtz’s aim
of obtaining a theory more general than the others and comparable to Maxwell’s. He also
investigated how Hertz’s experiments could allow one to determine which theory
was the best one. In doing so, Poincaré showed that Ampère’s theory
(based on a force acting at a distance) is not a particular case of Helmholtz’s (based
on the action of a potential) and is, in fact, the only one able to explain
the facts by an action between two elements reduced to one force along the straight
line joining them. As soon as one admits that Ampère’s law results from a potential,
as assumed by Helmholtz’s theory, and since the potential depends on the orientation
of the current elements, “derivatives [of the potential] with respect to the
angles defining its orientation are not identically null” (Ref. 29, p. 51). Applying
the principles of mechanics to Helmholtz’s theory, Poincaré showed that it yields
an unstable equilibrium when the constant k is negative (corresponding
to Weber’s theory), and that it must be rejected in this case. For k = 0 (Maxwell’s
theory) and k = +1 (Neumann’s theory), Poincaré showed that it is possible to
switch from Helmholtz’s theory to Maxwell’s by reducing the parameter λ (which is
equal to “1 in the system of electrostatic units and is the square of the light velocity
in the electromagnetic system”, Ref. 29, p. 56) to an infinitely small value:
Maxwell’s theory is thus a limiting case of Helmholtz’s theory. He thus concluded
that (Ref. 29, p. 110)


in Maxwell’s theory, there are only transverse vibrations and their propagation
velocity $V_2$ is equal to the velocity c of light [...]. If one sets λ to a
positive value different from 0, one has for $V_2$ a velocity greater than the
velocity of light [which is not possible. Consequently,] Maxwell’s theory can
thus [only] be deduced from Helmholtz’s by setting λ = 0.


Poincaré also preferred Maxwell’s theory because, on the one hand, “the ratio c 
among units is equal to the light velocity and is very well explained in it” (Ref. 29, 
p. 114) and, on the other hand, the governing equations were written in a “very 
elegant form” by Hertz. In particular, Poincaré showed that the equations govern- 
ing the electric displacement and those governing the current can be written as a 
function of the magnetic induction, thus exhibiting the correspondence between the 
electric displacement and the magnetic force (Ref. 29, pp. 115-116). 

For choosing the best electromagnetic theory, Poincaré investigated whether 
the so-called “principle of the unity of the electric force” (Ref. 29, p. 122) is in 
agreement with all possible theories or not. This principle, introduced by Hertz, is
described by Poincaré as follows (Ref. 29, p. 123):


We will admit for electricity a principle analogous to the principle that 
everybody admits for magnetism. A magnet in a ring shape and whose 
magnetism varies or, what is equivalent, a closed solenoid traversed by a 
variable current, is equivalent to an electric layer of a convenient power, 
from the electric field point of view that it produces. It will act as this layer 
onto another electric layer; and, according to the principle of action and 
reaction, will receive from this second layer a reaction equal and opposite 
to the applied action. Thus, a variable closed solenoid in an electric field 
receives a mechanical action; and, as such a solenoid produces an electric 
field, two variable closed solenoids apply one onto the other a mechanical 
action identical to the action produced by two equivalent electric layers. 
This is the principle of the unity of the electric force. 


Maxwell’s theory was the only one found by Poincaré to obey this principle:
this was yet another argument in favor of this theory.

The last of Poincaré’s monographs on electrodynamic theories was published in
1901 and was devoted to the electrodynamic theories developed by Hertz, Lorentz
and Larmor, respectively. As in his previous studies, he investigated these theories
mathematically and confronted them with the principles of mechanics. The
detailed analysis of Hertz’s theory for bodies at rest and its comparison with Maxwell’s
allowed Poincaré to show that the two theories agree on every point with
the exception of “the expression of the magnetic energy of Hertz [which] is thus
the single acceptable one” (Ref. 71, p. 362). A similar evaluation of Hertz’s and
Maxwell’s theories for bodies in motion allowed Poincaré to determine the equivalence
between these two theories and to assess their conformity with the
principles of mechanics. He also demonstrated that (Ref. 71, p. 388)


Hertz’s equations keep the same form when we adopt motionless axes as 
well as when we adopt moving axes; in other terms, Hertz’s equations keep 
the same form in relative motion as well as in absolute motion. 


Such a demonstration led him to the following observation (Ref. 71, p. 389): 


It thus results from the theory that the derivative a plays with respect to 
relative motion, the same role as played by the derivative oa with respect 
to absolute motion. 

This last remark induces two consequences: One is fortunate, the other
is annoying. The fortunate consequence is that Hertz’s equations are consistent
with the principle of action and reaction; the annoying consequence
is that these equations cannot describe certain optical phenomena.


Hertz was already aware of such a weakness in his theory and Lorentz tried to overcome
it. Consequently, Hertz’s theory was valid for describing electrical phenomena
and consistent with the principles of mechanics but was unable to describe optical
phenomena in motion (Ref. 71, p. 422).

In contrast, as Poincaré also showed, Lorentz’s theory “quite
well explains optical phenomena which were not explained by Hertz’s theory but,
unfortunately, it is not consistent with the principle of action and reaction” (Ref. 71,
p. 422). The main difference between these two theories, as evidenced by Poincaré,
is related to the assumptions added by Lorentz, that is, there is “neither magnetism,
nor dielectric other than the vacuum” (Ref. 71, p. 435). Moreover, in order to explain
optical phenomena by Lorentz’s theory, it must be admitted that (Ref. 71, p. 518):


If one wants that optical phenomena are not affected by the Earth motion, 
one must neglect in the formulas terms of the order of the squared aber- 
ration [...]. In almost all experiments, these terms are indeed negligible; 
there is however one exception for Michelson’s experiment which shows 
that the Earth motion has no influence on optical phenomena observed 
at the terrestrial surface and where it is found that terms of the order of 
aberration are no longer negligible. 


The description and comparison of the main existing theories allowed Poincaré
to bring out what they have in common. He thus listed the conditions that any
electrodynamical theory for a moving body should obey (see Ref. 78 and p. 602 of
Ref. 71):


(i) It should explain Fizeau’s experiments, that is, the partial driving 
of light waves or, which is equivalent, of transverse electromagnetic 
waves. 

(ii) It must verify the principle for the conservation of electricity and 
magnetism. 

(iii) It should be compatible with the principle of the equality between 
action and reaction. 


These conditions are imposed by known experiments. They are necessary conditions,
consistent with Poincaré’s metaphysics according to which “experience is the
sole source of certainty.” Concerning the theories successively developed by Hertz,
Helmholtz and Lorentz, Poincaré established that none of them simultaneously
fulfilled these three conditions:


• The theory of Hertz** satisfies the conditions (ii) and (iii);
• the theory of Helmholtz” satisfies the conditions (i) and (iii);
• the theory of Lorentz°?"®! satisfies the conditions (i) and (ii).


He thus concluded that (Ref. 71, p. 611): 


One may ask whether this is due to the fact that these theories are not 
completed or whether these three conditions are actually compatible or 
would become so only by a deep modification of the admitted assumptions
[...]. It is thus needed to give up on developing a perfectly satisfying theory
and to temporarily retain the less flawed one which seems to be Lorentz’s 
theory. 


To overcome these problems, the next important step concerned the measurement
of time for describing physical phenomena, which is a key point for writing the
coordinate transformation for switching from a frame at rest to a moving frame.⁸⁰
The relevant question asked by Poincaré and related to these problems was: “Can
we reduce to a single measure facts which occur in different worlds?” (Ref. 80,
p. 2). A related question was “can we transform a psychological time, which is
qualitative, into a quantitative time?” (Ref. 80, p. 2). Poincaré answered
this last question by asserting that “we have no direct intuition of the equality of two
intervals of time. The persons who believe they possess this intuition are deceived
by an illusion” (Ref. 80, p. 2). He thus explained (Ref. 80, p. 3):


When I say, from noon to one the same time passes as from two to three, 
what meaning has this affirmation? The least reflection shows that by itself 
it has none at all. It will only have the one which I choose to give it, by 
a definition which will certainly possess a certain degree of arbitrariness. 
To measure time [physicists and astronomers] use the pendulum and they 
suppose by definition that all the beats of this pendulum are of equal 
duration. But this is only a first approximation; temperature, resistance of 
the air, barometric pressure, make the rate of the pendulum vary. 


This is nothing other than a physical justification for the local character of
time, since these physical conditions (temperature, pressure, etc.) depend not only
on time but also on location. This is a first, perhaps slightly “naive”,
consideration of what would become the concept of a local time.

Poincaré then considered how the unit of time is defined, leading to the conclusion
that there is no rigor in its definition: “When we use the pendulum to measure
the time [we implicitly admit that] the duration of two identical phenomena is the
same or, if we prefer, that the same causes take the same time to produce the
same effects” (Ref. 80, pp. 3-4). In order to do that, physicists and astronomers
use the conservation of energy and Newton’s laws, which are only approximations
since they are deduced from experiment. Consequently, Poincaré could not avoid
concluding that “there is no manner to measure time which is more ‘true’ than
another; the one which is commonly adopted is only the most convenient” (Ref. 80,
p. 6).

Considering the problem of how to determine the simultaneity of two phenom- 
ena, he pointed out the singular role played by light in this question (Ref. 80, 
p. 11): 


When an astronomer tells me that some stellar phenomenon, that his telescope
reveals to him at this moment, happened, nevertheless, fifty years
ago, I seek his meaning, and to that end I shall ask him first how he knows
it, that is, how he has measured the velocity of light. 

He has begun by “supposing” that light has a constant velocity, and in 
particular that its velocity is the same in all directions. This is a postulate 
without which no measurement of this velocity could be attempted. 


Poincaré thus insisted on the fact that when optical measurements are performed, it 
is always implicitly assumed that the light velocity is constant in a space which was 
considered as isotropic and homogeneous. More explicitly, a theory is required to 
justify the choice of the quantity measured and how it is related to the assumption 
tested. In the case considered, Poincaré insisted on the fact that the assumptions 
(the light velocity) and the theory are not independent: There is an inner consis- 
tency but the explanation provided is not unique. Once again, the retained solution 
was selected according to its simplicity. More importantly, Poincaré pointed out 
that it is impossible to dissociate the concept of simultaneity from the measure of 
time. This problem is necessarily associated with any change of reference frame. 

Poincaré then revisited the principles of mechanics,⁸¹ distinguishing what is
learnt from experiment from what is obtained by mathematical reasoning, the latter
only being a convention or an assumption. He started with a severe criticism of the
foundations on which Newton’s Principia Mathematica was constructed (Ref. 81,
p. 458):


(i) There is no absolute space, and we only conceive of relative motion 

(ii) There is no absolute time [...]. 

(iii) Not only have we no direct intuition of the equality of two periods,
but we have not even direct intuition of the simultaneity of two events 
occurring in two different places. 

(iv) Finally, is not our Euclidean geometry in itself only a kind of conven- 
tion of language? Mechanical facts might be enunciated with reference 
to a non-Euclidean space. 


[Consequently], we might endeavour to enunciate the fundamental laws 
of mechanics in a language independent of all these conventions [...]. No 
doubt that the enunciation of these laws would become much more compli- 
cated, because all these conventions have been adopted for the very purpose 
of abbreviating and simplifying the enunciation. (Ref. 81, p. 459) 


All these considerations led Poincaré to introduce a local time and a non-Euclidean 
space. 

In the same year (1900), Poincaré wrote some comments on the theory of 
Lorentz’ in a compendium to celebrate the 25th anniversary of Lorentz’s Ph.D. 
thesis. This theory was considered by Poincaré as the least bad theory, mainly 
because “the principle of relativity of motion has been verified only imperfectly.” 


In fact, this principle is only verified when terms of the order of the squared
aberration are neglected. The main criticism
provided by Poincaré concerned the principle of action and reaction which was no
longer satisfied when it was applied to the matter alone, that is, when no momentum
was attributed to light (Ref. 82, p. 269):


In order to show experimentally that the principle of reaction is broken in 
reality as it is in Lorentz’s theory, it was not sufficient to show that the 
devices producing energy undergo a recoil, which would be already quite 
difficult to do, it should be also shown that this recoil is not balanced by the 
movement of dielectrics and, in particular, by the movement of air crossed 
by electromagnetic waves, which would be obviously much more difficult 
to prove. 


To argue that this objection is related to relative motion and not to absolute motion, 
Poincaré developed the example as follows (Ref. 82, pp. 270-271): 


Let A and B be two bodies, acting on each other, but not under any external 
action; if the action of one of each was not equal to the reaction of the other, 
one could attach one to the other with a rod of constant length in such a 
manner that they behave as a single solid body. The forces applied to this 
body being not at the equilibrium, the system would be in movement and 
this motion would go with a constant acceleration but at one condition, 
that is, the mutual action of the two bodies only depends on their relative 
positions and their relative velocities, but is independent on their absolute 
positions and their absolute velocities. 

The principle of reaction thus appears as a consequence of the principles 
of the [conservation of] energy and of the relative motion. 


For Poincaré, Lorentz’s theory can only be in agreement with experimental facts 
if “phenomena are related, not to the true time t but to a certain local time t’” 
(Ref. 82, p. 272), that he defined as follows (Ref. 82, pp. 272-273): 


I suppose that some observers placed at various points, synchronize their 
clocks using light signals. They attempt to correct these signals from the 
transmission duration, but they are not aware about the motion of trans- 
lation with which they are animated and thus, believing that these signals 
travel equally fast in both directions, they limit themselves to cross the 
observations, sending one signal from A to B, followed by another one from 
B to A. 

The local time $t'$ is the time indicated by the clocks which are so
adjusted.

If $c$ is the speed of light, and $V$ is the speed of the Earth that we suppose
to be parallel to the x-axis in the positive direction, one will have

$$t' = t - \frac{Vx}{c^2}. \qquad (58)$$

The apparent energy propagates in a relative motion according to the same
laws as the actual energy in absolute movement, but the apparent energy 
is not exactly equal to the actual corresponding energy. 

In relative movement, the bodies producing the electromagnetic energy 
are subject to an apparent complementary force which does not exist in 
absolute movement. 


It is important to note here that, due to the existence of a local time, some effects,
such as those measured in a “resting frame”, are only apparent, being due to a change of
reference coordinates. This is a very important aspect which will be discussed again a little
further on.
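
The synchronization procedure quoted above can be checked numerically. In the sketch below, which is only an illustration under stated assumptions, light travels at c in the frame where the ether is at rest, the two clocks move together at speed V along the x-axis, and clock B (at relative distance x ahead of A) is set by assuming equal travel times in both directions. The function names and the numerical values (V = 30 km/s, x = 1 km) are hypothetical. The resulting offset agrees with the first-order expression of Eq. (58) up to a relative correction of order V²/c².

```python
C = 299_792_458.0  # speed of light, m/s


def local_time_offset(x, v, c=C):
    """Offset t' - t of clock B, set by crossed light signals between A and B.

    A and B move together at speed v along x; B is ahead of A by x.
    The signal speed is c in the frame where the ether is at rest.
    """
    t_out = x / (c - v)    # true travel time A -> B (B recedes from the signal)
    t_back = x / (c + v)   # true travel time B -> A (A approaches the signal)
    # Observers unaware of their motion assign half the round trip to each leg.
    assumed_one_way = 0.5 * (t_out + t_back)
    # B reads `assumed_one_way` on arrival, while the true elapsed time is t_out.
    return assumed_one_way - t_out


def first_order_offset(x, v, c=C):
    """First-order local-time correction -v*x/c**2, as in Eq. (58)."""
    return -v * x / c**2


x, v = 1.0e3, 3.0e4   # illustrative: clocks 1 km apart, Earth speed about 30 km/s
exact = local_time_offset(x, v)
approx = first_order_offset(x, v)
print(f"offset from the signal exchange: {exact:.6e} s")
print(f"first-order formula -Vx/c^2:     {approx:.6e} s")
print(f"relative difference:             {abs(exact - approx) / abs(approx):.1e}")
```

The exchange of signals alone fixes the offset: no knowledge of the motion is needed by the observers, which is why the local time is the time their clocks actually show.
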

From this argument, Poincaré concluded that, according to Lorentz’s theory,
the principle of reaction cannot be applied only to ponderable matter and should
also be applied to light. This is what should be understood when Poincaré attributed a
momentum to matter and to a “fictional fluid”.⁸² He made this quite explicit
when he wrote that “the electromagnetic energy behaves as a fluid which has inertia,
we must conclude that, if any sort of device produces electromagnetic energy and
radiates it in a particular direction, that device must recoil just as a cannon does
when it fires a projectile” (Ref. 82, p. 260). Poincaré then provided an explicit example
with a “Hertzian exciter placed at the focus of a parabolic mirror” (Ref. 82,
p. 260):


It is easy to evaluate the recoil quantitatively. If the device has a mass of
1 kg and if it emits three million joules in one direction with the velocity
of light, the speed of recoil is 1 cm·s⁻¹.

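
The figure quoted by Poincaré can be checked with a short computation. The sketch below assumes that radiation of energy E carries momentum E/c (the quotation itself states no formula) and that the device is initially at rest; the function name is hypothetical.

```python
C = 299_792_458.0  # speed of light, m/s


def recoil_speed(radiated_energy, device_mass, c=C):
    """Recoil speed of a device emitting `radiated_energy` in one direction.

    Assumption: the radiation carries momentum E / c, which the device
    balances by recoiling with momentum m * v (non-relativistic recoil).
    """
    radiation_momentum = radiated_energy / c
    return radiation_momentum / device_mass


v = recoil_speed(radiated_energy=3.0e6, device_mass=1.0)  # joules, kilograms
print(f"recoil speed: {v * 100:.3f} cm/s")
```

Under this assumption the result is very close to one centimetre per second, in agreement with the quoted order of magnitude.
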
Due to the context in which this example occurred, it is very likely that Poincaré
used a formula for the momentum like $p = E/c$, where $p$ is the momentum and $E$
the energy.ᵗ Consequently, the principle of relative motion should be applied to
ponderable matter and to electromagnetic fields. Nevertheless, Fizeau’s experiment
was still in contradiction with the principle of reaction. Such a feature led Poincaré
to conclude that, if (Ref. 82, p. 278)


the driving of waves is only partial, this is due to the relative propagation of
waves in a moving medium which does not obey the same law as the propagation
in a resting medium, that is, the principle of a relative movement
does not only apply to matter and it is required to apply a correction [...]
which consists in the introduction of a local time. If this correction is not
balanced by some others, we shall conclude that the principle of reaction
is not true for the matter.

Thus all theories obeying this principle would be condemned, unless we
would consent to change all our ideas on electromagnetism.


ᵗSuch a relation has to be used for getting the numerical value for the speed of recoil. It was
correctly introduced by Einstein in 1917.8° There is another possibility to obtain this result
by stating that the variation of the mass is related to the variation of the energy according to
$dm = dE/c^2$, which was the form in which Einstein published the equivalence mass-energy in 1905®:
“The mass of a body is a measure of its energy content; if the energy changes by $dE$, the mass
changes in the same sense by $dE/(9 \times 10^{20})$, if the energy is measured in ergs and the mass in grams.”


In 1902, Poincaré published a book entitled Science and Hypothesis.®° Often
considered as written for a broad audience, this book is in fact just a concatenation
of already published papers from which mathematical equations were removed.
Two years later, in 1904, following the success of the first book, Poincaré published a
second book entitled The Value of Science.⁸⁶ The construction of this second book
is similar to that of the first one. In it, Poincaré refined his idea that there is a need for
a new mechanics (Ref. 86, Par. 197):


From all these results, if they were confirmed, would arise an entirely new 
mechanics, which would be, above all, characterized by this fact, that no 
velocity could surpass that of light, because bodies would oppose an increas- 
ing inertia to the causes which would tend to accelerate their motion; and 
this inertia would become infinite when one approached the velocity of 
light. 


This assertion results from crucial experiments conducted by Walter Kaufmann 
(1871-1947)®: 


I was able to report about an experiment with the result that the ratio 
$e/m$ of Becquerel rays would decrease with increased velocity, and m would
increase if one assumes e as constant, namely it increases the quicker, the 
more the velocity V would approach the speed of light c. Such a behavior is 
theoretically given from the equation of energy of a quickly moving electric 
charge. 


Such an experiment was motivated by the theoretical studies developed by Abraham®’
and Kaufmann.®

The year 1904 was also the year of an important conference in Saint Louis
(Missouri), where Poincaré gave a talk entitled L’état actuel et l’avenir de la physique
mathématique (Ref. 73), which was included in his second book The Value of Science. As
we already mentioned, the concept of “principle” was very important for Poincaré
(Ref. 81, p. 491):


The principles of mechanics are presented to us under two different aspects. 
On the one hand, there are truths founded on experiment, and verified 
approximately as far as almost isolated systems are concerned; on the 
other hand, they are postulated applicable to the whole of the universe 
and regarded as rigorously true. If these postulates possess a generality 
and a certainty which are lacking in experimental truths from which they 
were deduced, it is because they reduce in final analysis to a simple convention
that we have a right to make, because we are certain beforehand that
no experiment can contradict it. [...] We admit it because certain experiments
have shown us that it will be convenient, and thus is explained how
experiments have built up the principle of mechanics, and why, moreover,
it cannot reverse them.


Retrospectively, in 1895, Poincaré only evoked “the principle of [conservation
of] energy” in his lectures at La Sorbonne on the mathematical theory of light.”
In his text on the principles of mechanics (Ref. 81), Poincaré discussed five principles:


The principle of inertia; 

The law of acceleration; 

The principle of reaction; 

The principle of relative motion; 
The principle of energy. 


These principles summarized all of Poincaré’s questioning about light propagation 
arising from Fizeau’s and Michelson and Morley’s experiments. The principle of 
relative motion was enunciated as follows (Ref. 81, p. 477): 


The movement of any system whatever ought to obey the same laws, 
whether it is referred to fixed axes or to the movable axes which are implied 
in uniform motion in a straight line. 


This is clearly the invariance of the laws under a change of reference frame, when 
one is related to the other by a constant velocity. Nevertheless, in 1900, the trans- 
formation from one frame to the other was not known. The “classical” composition 
law for velocities was clearly not working for explaining optical experiments such 
as Michelson and Morley’s. In his 1904 talk at Saint Louis, Poincaré discussed six 
principles ranked as follows (Ref. 73, p. 306): 

(i) The conservation of energy; 

(ii) the degradation of energy; 
(iii) the equality of action and reaction; 
(iv) the principle of relativity; 

(v) the conservation of mass; 
(vi) the principle of least action. 


Poincaré’s discussion of these six principles expressed all his doubts. We could
say that this was the year of questionings induced by a crisis. The law of acceleration,
in fact the fundamental principle of dynamics, was no longer discussed,
but the principle of least action was added. The degradation of energy was introduced
by Sadi Carnot (1796-1832).8” Energy seemed to be conserved, but it was
degraded whenever its form (mechanical, electrical, thermal, etc.) changed. The
principle of action and reaction was not verified by Lorentz’s theory. The principle
of relativity was enunciated in a slightly different way compared to 1901
(Ref. 73, p. 306):


The laws of physical phenomena must be the same for a stationary observer as well
as for an observer carried along in a uniform motion of translation; so that 
we have not and cannot have any means of discerning whether or not we 
are carried along in such a motion. 


This principle remains the key point to solve the problem posed by light propagation.
In 1904, it was not yet possible to apply it to optical experiments.
Kaufmann’s experiments seemed to contradict the conservation of mass since the 
mass was varying with the velocity. Even the conservation of energy was subject 
to doubt, due to Pierre Curie and Albert Laborde’s experiments on radioactive 
elements which showed that a significant amount of energy was emitted.** Poincaré 
then concluded his talk as follows (Ref. 73, p. 324): 


Perhaps, too, we shall have to construct an entirely new mechanics that we 
only succeed in catching a glimpse of, where, inertia increasing with veloc- 
ity, the velocity of light would become an impassable limit. The ordinary 
mechanics, more simple, would remain a first approximation, since it would 
be true for velocities not too great, so that the old dynamics would still be 
found under the new. 


Less than one year after his talk at Saint Louis, Poincaré proposed his
own version of the dynamics of electrified particles and, more particularly, the
dynamics of the electron, resulting from his analysis of the theories published by
his predecessors. He based his investigations on Lorentz’s theory®* and Langevin’s
theory.®? Poincaré quickly stated that Langevin’s theory, although interesting because
it was only based on electromagnetic forces and binding forces, is not compatible with the
principle of relativity, as initially shown by Lorentz and then by himself.

Poincaré’s theory was published in two papers: a short note read at the
Academy of Sciences (Paris) on June 5, 1905 (see Ref. 91), which is only a summary
of Poincaré’s findings because of the length limit imposed by the Comptes-Rendus
de l’Académie des Sciences, and a full paper, submitted on July 23, 1905 to the
Rendiconti del Circolo Matematico di Palermo, which was only published in
1906. There is no doubt that most of the second paper was already written when Poincaré
read his contribution on June 5, since he wrote “I show” (but no proof was given
in the short note, in contrast to what was included in the full paper) and “this is what
I determined”. Poincaré was not used to announcing a result without at least a
sketch of the proof. This is why we prefer to describe the full paper
submitted on July 23 directly.

Poincaré started by recalling that this entire problem came from
stellar aberration and some optical experiments, such as the one conducted by Michelson
and Morley, related to the driving of the ether. He clearly mentioned the principle
of relativity (Ref. 90, p. 129):


It seems that this impossibility of demonstrating an experimental evidence 
for absolute motion of the Earth is a general law of nature; we are naturally 
led to admit this law, which we will call the Principle of Relativity and 
admit it without restriction. 


He then stated that FitzGerald’s and Lorentz’s contraction was sufficient when the 
principle of relativity was taken in all its generality.°° 

It was thus of importance to find how the principle of relativity can be applied to 
electromagnetic phenomena, that is, to what is today called the Maxwell equations. 
This is presented in Poincaré’s paper as follows (Ref. 90, p. 130): 


The idea of Lorentz can be summarized as follows: If we can bring the 
whole system to a common translation, without modification of any of 
the apparent phenomena, it is because the equations of the electromag- 
netic medium are not altered by certain transformations, which we will call 
Lorentz transformations; two systems, one motionless, the other in trans- 
lation, thus become exact images of one another. 


Indeed, one of the results provided in Lorentz’s paper is that the Maxwell equations
are left invariant under the “Lorentz transformation”, which is the modern expression
of what the principle of relativity tells us when applied to Maxwell equations.
Poincaré had already tried to show this in his lecture given in 1899 and published
in 1901,”! but failed. His best results were obtained with Lorentz’s 1892 theory.
Lorentz developed his second theory to get such an invariance (and failed too)
but, for him, this was only an intermediate step to explain optical experiments.
Consequently, he did not comment on this result in his paper.

In contrast to this, the invariance of Maxwell’s equations is central in Poincaré’s 1905
contribution. Moreover, he showed that Lorentz transformations (a name
given by Poincaré in Refs. 90 and 91) form a group, a mathematical
concept on which Poincaré had worked extensively
and published in 1881 (see Refs. 92-95 and also see Gray for a review on Poincaré’s
contribution to group theory®®). When properly applied, this transformation provides
the right composition law for velocities, a point missed by Lorentz. From the
mathematical point of view, it allows one to understand what Poincaré had in mind
when he wrote (Ref. 91, p. 576)


the results which I obtained are in agreement with those of Lorentz on all 
important points; I was only led to modify and supplement them in some 
points of detail; one will further see the differences which are of secondary 
importance. 


He “just” corrected the composition law for velocities and its consequences in
Lorentz’s theory. The equations for electromagnetic media are nothing else than
the Maxwell equations. Poincaré adopted Lorentz’s specific system of units; in particular,
he used the electric displacement D and the magnetic force H in order to
eliminate the factors $4\pi$ in the formulas. Poincaré also chose the units of length
and time so that the speed of light was equal to 1 (as he mentioned in the
letter sent to Lorentz in May 1905, in which he already gave the corrected transformation;
see Refs. 65 and 66), a convention very useful to emphasize the underlying symmetry
between the electric and magnetic fields. The Maxwell equations used by Poincaré,
translated into the modern vectorial form, are thus (Ref. 90, p. 132)


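$$
\begin{aligned}
I = \frac{dD}{dt} + \rho V &= \nabla \wedge H, &\qquad \frac{dH}{dt} &= -\nabla \wedge D, \\
\frac{d\rho}{dt} + \nabla \cdot (\rho V) &= 0, &\qquad \nabla \cdot D &= \rho,
\end{aligned}
\tag{59}
$$


where I designates the current, V is the velocity of the electrons, H is the magnetic
force, D is the electric displacement and $\rho$ is the electric density. Poincaré did not
add to this set of equations the common


$$
\nabla \cdot H = 0, \tag{60}
$$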


perhaps because, when he expressed D and H in terms of the scalar potential $\psi$
and the vector potential A, he obtained (Ref. 29, p. 132)


$$
\begin{aligned}
D &= -\frac{dA}{dt} - \nabla \psi, \\
H &= \nabla \wedge A.
\end{aligned}
\tag{61}
$$


Since the divergence of a curl vanishes identically, Eq. (60) then holds automatically.
Nevertheless, Eq. (60) was explicitly justified by Poincaré when he investigated
Lorentz’s electromagnetic theory (Ref. 29, pp. 426-427). Poincaré checked all these
fundamental equations before any attempt to demonstrate their invariance under
the Lorentz transformation.

Poincaré also clarified his notations for partial derivatives: 


Our functions can be viewed (i) either as being dependent on the five variables
$x, y, z, t, \varepsilon$ in such a way that one always remains at the same place when
only $t$ and $\varepsilon$ are varied: We will thus designate their derivatives by ordinary
“$d$”; (ii) or as being dependent on the five variables $x, y, z, t, \varepsilon$ in such a way
that one always follows a single given electron when only $t$ and $\varepsilon$ are varied:
We will designate their derivatives by “$\partial$”.


Thus, Poincaré expressed the magnetic and electric fields in terms of the scalar
potential $\psi$ and the vector potential A. This means that the electromagnetic field
results from a four-dimensional potential. From this, and using the gauge condition


$$
\frac{d\psi}{dt} + \nabla \cdot A = 0, \tag{62}
$$


the Maxwell equations (59) can be rewritten in the very compact form (Ref. 90,
p. 132)


$$
\begin{aligned}
\Box \psi &= -\rho, \\
\Box A &= -\rho V,
\end{aligned}
\tag{63}
$$


where


$$
\Box = \nabla^2 - \frac{d^2}{dt^2}
$$


is the d’Alembertian as named after Lorentz.®* Poincaré thus added the Lorentz
force (Ref. 90, p. 132)


$$
F = \rho D + \rho (V \wedge H). \tag{64}
$$

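All these equations were thus transformed by applying the “remarkable transformation
discovered by Lorentz” (Ref. 90, p. 132)


$$
T_l : \quad
\begin{cases}
t' = \gamma l (t + \beta x) \\
x' = \gamma l (x + \beta t) \\
y' = l y \\
z' = l z
\end{cases}
\tag{65}
$$


where $l$ and $\beta$ are two arbitrary constants used by Lorentz, and where (Ref. 90,
p. 132)


$$
\gamma = \frac{1}{\sqrt{1 - \beta^2}} \tag{66}
$$


is the contraction coefficient introduced by Lorentz (also written $k$ further on, following
Poincaré’s notation). The Lorentz transformation
was the missing key element Poincaré needed to show the invariance of the Maxwell
equations. He was thus rewarded with this transformation for having pushed
Lorentz to write a second theory.

Poincaré then introduced the inverse transformation (Ref. 90, p. 132)


$$
T_l^{-1} : \quad
\begin{cases}
t = \dfrac{\gamma}{l} (t' - \beta x') \\
x = \dfrac{\gamma}{l} (x' - \beta t') \\
y = \dfrac{y'}{l} \\
z = \dfrac{z'}{l}
\end{cases}
\tag{67}
$$


and also introduced the right composition law for velocities (Ref. 90, p. 133):


$$
\begin{aligned}
V'_x &= \frac{dx'}{dt'} = \frac{d(x + \beta t)}{d(t + \beta x)} = \frac{V_x + \beta}{1 + \beta V_x}, \\
V'_y &= \frac{dy'}{dt'} = \frac{dy}{\gamma\, d(t + \beta x)} = \frac{V_y}{\gamma (1 + \beta V_x)}, \\
V'_z &= \frac{dz'}{dt'} = \frac{dz}{\gamma\, d(t + \beta x)} = \frac{V_z}{\gamma (1 + \beta V_x)}.
\end{aligned}
\tag{68}
$$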


This is one of the very relevant contributions by Poincaré to the principle of relativity:
He was the first to understand that the “classical composition law for velocities”
no longer applies when the Lorentz transformation is used. Poincaré was thus able
to obtain the continuity equation for the new electric density $\rho'$ when the electric charge of
the electron is kept constant (Ref. 90, p. 134):


$$
\frac{d\rho'}{dt'} + \nabla' \cdot (\rho' V') = 0. \tag{69}
$$

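Poincaré noted a departure from the second theory of Lorentz: The
value for this new density was found by Lorentz to be (Ref. 90, p. 133)


$$
\rho' = \frac{1}{k l^3}\, \rho, \tag{70}
$$


which Poincaré corrected into (Ref. 90, p. 133)


$$
\rho' = \frac{k}{l^3}\, \rho\, (1 + \beta V_x), \tag{71}
$$


where $k = 1/\sqrt{1 - \beta^2}$ is the contraction coefficient of Eq. (66).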


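He was also able to express the new scalar and vector potentials (Ref. 90, p. 134)


$$
\begin{aligned}
\Box' \psi' &= -\rho', \\
\Box' A' &= -\rho' V',
\end{aligned}
\tag{72}
$$


whose components are (Ref. 90, p. 134)


$$
\begin{aligned}
\psi' &= \frac{k}{l} (\psi + \beta A_x), \\
A'_x &= \frac{k}{l} (A_x + \beta \psi), \\
A'_y &= \frac{1}{l} A_y, \\
A'_z &= \frac{1}{l} A_z.
\end{aligned}
\tag{73}
$$


The new magnetic and electric fields in the moving frame thus satisfy the equations
(Ref. 90, p. 134)


$$
\begin{aligned}
D' &= -\frac{dA'}{dt'} - \nabla' \psi', \\
H' &= \nabla' \wedge A',
\end{aligned}
\tag{74}
$$


which led to the Maxwell equations rewritten in the form (Ref. 90, p. 135)


$$
\begin{aligned}
\frac{dD'}{dt'} + \rho' V' &= \nabla' \wedge H', &\qquad \frac{dH'}{dt'} &= -\nabla' \wedge D', \\
\frac{d\rho'}{dt'} + \nabla' \cdot (\rho' V') &= 0, &\qquad \nabla' \cdot D' &= \rho'.
\end{aligned}
\tag{75}
$$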

The Maxwell equations are thus left invariant under the Lorentz transformation.
Poincaré ended his proof by checking that the Lorentz force is also left invariant,
that is (Ref. 90, p. 135),


$$
F' = \rho' D' + \rho' (V' \wedge H').
$$


Poincaré thus obtained the invariance of Maxwell’s equations under the Lorentz transformation,
a result he had been looking for since at least 1899 (he had already tried to obtain
it in his 1899 lecture).

In order to complete his study, Poincaré wanted to show that the Lorentz transformation
defined a group, an extremely important condition, according to his statements
written in 1895.9” He thus introduced four infinitesimal generators (Ref. 90,
p. 145)


$$
\begin{aligned}
T_0 &= x \cdot \nabla + t\, \frac{\partial}{\partial t}, \\
T_x &= t\, \frac{\partial}{\partial x} + x\, \frac{\partial}{\partial t}, \\
T_y &= t\, \frac{\partial}{\partial y} + y\, \frac{\partial}{\partial t}, \\
T_z &= t\, \frac{\partial}{\partial z} + z\, \frac{\partial}{\partial t},
\end{aligned}
$$


corresponding to the temporal component and the three spatial coordinates, respectively.
Poincaré then established that the form of the Lorentz equations is independent
of the choice of axes. From this, he introduced a continuous group, which
he named the “Lorentz group”, and which admits the following infinitesimal transformations
(Ref. 90, p. 146):

(i) The transformation $T_0$ which is permutable with all others;
(ii) the three transformations $T_x$, $T_y$, $T_z$;
(iii) the three rotations $[T_x, T_y]$, $[T_y, T_z]$, $[T_z, T_x]$.

Any transformation of this group can always be decomposed into a
scaling transformation of the form (Ref. 90, p. 146)


$$
\begin{cases}
t' = l t \\
x' = l x \\
y' = l y \\
z' = l z
\end{cases}
\tag{76}
$$


and a linear transformation that does not alter the quadratic form (Ref. 90,
p. 146)


$$
x^2 + y^2 + z^2 - t^2. \tag{77}
$$


Poincaré also showed that this last quantity is invariant under these linear transformations
because it is annihilated by the infinitesimal generators $T_x$, $T_y$, $T_z$ (the generator
$T_0$ only rescales it). He added that any transformation of the form (Ref. 90, p. 146)


$$
\begin{cases}
t' = k l (t + \beta x) \\
x' = k l (x + \beta t) \\
y' = l y \\
z' = l z
\end{cases}
\tag{78}
$$


with $k = 1/\sqrt{1 - \beta^2}$, preceded and followed by a suitable rotation, is a transformation of the Lorentz
group. In his second theory, Lorentz proposed $l = 1$ up to a second-order quantity.
Poincaré showed that $l = 1$ is required for having a transformation belonging to
the Lorentz group after a rotation by $\pi$ around the y-axis (Ref. 90, p. 163).
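
The group property and the composition law for velocities discussed above can be checked numerically. The following is only a minimal sketch, not Poincaré’s own derivation: it assumes units where c = 1, motion along the x-axis, and the form of the transformation given in Eq. (78). All function names (lorentz_matrix, compose_velocities, and so on) are hypothetical helpers introduced for this illustration.

```python
import math


def lorentz_matrix(beta, l=1.0):
    """Matrix of Eq. (78) acting on an event (t, x, y, z), with c = 1."""
    k = 1.0 / math.sqrt(1.0 - beta * beta)  # contraction coefficient
    return [
        [k * l, k * l * beta, 0.0, 0.0],
        [k * l * beta, k * l, 0.0, 0.0],
        [0.0, 0.0, l, 0.0],
        [0.0, 0.0, 0.0, l],
    ]


def matmul(a, b):
    """Plain 4x4 matrix product, i.e. the composition of two transformations."""
    return [[sum(a[i][m] * b[m][j] for m in range(4)) for j in range(4)]
            for i in range(4)]


def apply(matrix, event):
    """Apply a 4x4 matrix to an event (t, x, y, z)."""
    return [sum(matrix[i][j] * event[j] for j in range(4)) for i in range(4)]


def compose_velocities(beta1, beta2):
    """Composition law of Eq. (68) for two velocities along the x-axis."""
    return (beta1 + beta2) / (1.0 + beta1 * beta2)


def quadratic_form(event):
    """Quadratic form of Eq. (77): x^2 + y^2 + z^2 - t^2."""
    t, x, y, z = event
    return x * x + y * y + z * z - t * t


def max_abs_difference(a, b):
    """Largest entry-wise difference between two 4x4 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))


if __name__ == "__main__":
    # Two successive transformations versus a single one with the composed velocity.
    product = matmul(lorentz_matrix(0.6), lorentz_matrix(0.7))
    single = lorentz_matrix(compose_velocities(0.6, 0.7))
    print("closure defect:", max_abs_difference(product, single))

    # The quadratic form is preserved when l = 1.
    event = [1.0, 0.3, -0.2, 0.5]
    print("form before:", quadratic_form(event))
    print("form after: ", quadratic_form(apply(lorentz_matrix(0.6), event)))
```

With l = 1, two successive transformations reduce to a single one whose velocity is given by the composition law, and the composed velocity stays below the speed of light. With l different from 1, the quadratic form is multiplied by l squared, which is the scaling part of the decomposition described above.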


8. Einstein’s 1905 Contribution 


The first contribution by Albert Einstein (1879-1955) to the theory of special rel- 
ativity was published in 1905 in the Annalen der Physik. In the Introduction, the 
principle of relativity is stated as follows (Ref. 98, pp. 891-892): 


The unsuccessful attempts to discover any motion of the Earth relatively to 
the “light-medium” suggest that the phenomena of electrodynamics as well 
as of mechanics possess no properties corresponding to the ideal of absolute 
rest. They suggest rather that, as has already been shown for the first order 
of small quantities, the same laws of electrodynamics and optics will be valid 
for all frames of reference for which the equations of mechanics hold good. 
We will raise this conjecture (the purport of which will hereafter be called 
the Principle of Relativity) to the status of a postulate, and also introduce 
another postulate, which is only apparently irreconcilable with the former, 
namely that light is always propagated in empty space with a definite 
velocity c which is independent of the state of motion of the emitting body.
These two postulates suffice for the attainment of a simple and consistent
theory of the electrodynamics of moving bodies based on Maxwell’s theory 
for bodies at rest. The introduction of a “luminiferous ether” will prove to 
be superfluous inasmuch as the view here to be developed will not require an 
“absolutely stationary space” provided with special properties, nor assign 
a velocity-vector to a point of the empty space in which electromagnetic 
processes take place. 


The postulates used by Einstein to construct his theory are thus (i) the principle
of relativity and (ii) the postulate according to which the light velocity c is constant,
enunciated as (Ref. 98, p. 895):


Any ray of light moves in the “stationary” system of co-ordinates with the 
determined velocity c, whether the ray be emitted by a stationary or by a 
moving body. 


Einstein used in fact the electrodynamic theory developed by Hertz: He thus based 
his theory on the “kinematics of rigid bodies” — corresponding to the coordinate 
system — and which has to be compared to the “rigid system of bodies” used by 
Hertz (Ref. 54, p. 246 and also see the quotation in p. 40). 

Einstein needed to define what is meant by simultaneity and to explain how
to get synchronous clocks, as Poincaré did.®° He also introduced the idea that “the
length of the moving rod measured from the stationary system [is] different from [...]
the length of the rod in the moving system” (Ref. 98, p. 896): This is equivalent to
the length contraction introduced by FitzGerald and Lorentz that we previously
discussed.

Einstein then investigated “the transformation of coordinates and time from a
stationary system to another system which is in uniform motion of translation relatively
to the former” (Ref. 98, p. 897). The unstated aim of this section is to
establish the transformation of coordinates which will leave Maxwell’s
equations invariant when one switches from a frame at rest to a moving frame. Maxwell’s
equations were established for describing electromagnetic phenomena. The coordinate
transformation, the so-called Lorentz transformation, was obtained by
Lorentz by investigating these equations. In contrast to this, Einstein only considered
one of the coordinate systems, “k”, which “has a constant velocity [V] in
the direction of the x-axis of the other”, which is a stationary system “K” (Ref. 98,
p. 897). He defined the relative position of the two systems as follows (Ref. 98,
p. 897):


Any time t of the stationary system “K” corresponds to a definite position 
of the axes of the moving system, which are always parallel to the axes of 
the stationary system. By t, we always mean the time in the stationary 
system.


A point in the stationary system “K” corresponds to a point in the moving system
“k” according to (Ref. 98, p. 898)


To any system of values (x, y, z, t) which completely defines the position and
time of an event in the stationary system, there corresponds a system of
values (x’, y’, z’, t’) determining that event relatively to the system “k”, and
our task is now to find the system of equations connecting these quantities.


In order to obtain these equations, Einstein stated that “if $\xi = x - Vt$, it is clear
that a point at rest in the system “k” must have a system of values $(\xi, y, z)$ which
are independent of time” (Ref. 98, p. 898). In this expression, V is the velocity of
system “k” with respect to the stationary system “K” and $\xi$ is a given position along
the x’-axis in the moving system “k”. After some unclear considerations leading to
(Ref. 98, p. 899)


$$
\frac{\partial t'}{\partial \xi} + \frac{V}{c^2 - V^2}\, \frac{\partial t'}{\partial t} = 0, \tag{79}
$$


Einstein expressed time t’ in the moving system “k” as (Ref. 98, p. 899)


$$
t' = a \left( t - \frac{V}{c^2 - V^2}\, \xi \right), \tag{80}
$$


where “a is a function $\varphi(V)$ at present unknown, and where for brevity it is assumed
that at the origin of “k”, t’ = 0 when t = 0” (Ref. 98, p. 899).

Einstein then proposed a coordinate transformation between the resting system
“K” and the moving system “k” under the form (Ref. 98, p. 900):


$$
\begin{aligned}
x' &= \varphi(V)\, \gamma\, (x - Vt), \\
y' &= \varphi(V)\, y, \\
z' &= \varphi(V)\, z,
\end{aligned}
\tag{81}
$$


where $\varphi(V)$ is not yet known and $\gamma = \dfrac{1}{\sqrt{1 - \frac{V^2}{c^2}}}$ is the contraction coefficient introduced
by Lorentz. Once he showed that $\varphi(V) = 1$, he obtained the final coordinate
transformation (Ref. 98, p. 902)


$$
x' = \gamma (x - Vt). \tag{82}
$$


He then wanted to prove that (Ref. 98, p. 901)


any ray of light, measured in the moving system, is propagated with the 
velocity c, if, as we have assumed, this is the case in the stationary system; 
for we have not as yet furnished the proof that the principle of the constancy 
of the velocity of light is compatible with the principle of relativity. 


However, Einstein already used it when he wrote expression (80) for time t’ in the
moving system “k”; indeed, he previously wrote that (Ref. 98, p. 899) 


light (as required by the principle of the constancy of the velocity of light, 
in combination with the principle of relativity) is also propagated with 
velocity c when measured in the moving system. 


Indeed, he had already explicitly used x = ct and x’ = ct’ for getting his coordinate
transformation. We have thus necessarily


$$
x^2 + y^2 + z^2 = c^2 t^2 \tag{83}
$$
and
$$
x'^2 + y'^2 + z'^2 = c^2 t'^2. \tag{84}
$$
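
The link between Eqs. (83) and (84) can be illustrated numerically. The sketch below is an assumption-laden illustration rather than Einstein’s argument: it uses the standard full form of the transformation, t’ = γ(t − Vx/c²), x’ = γ(x − Vt), y’ = y, z’ = z, of which only the x’ line appears in Eq. (82), and the names einstein_transform and light_sphere_defect are hypothetical helpers.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def einstein_transform(t, x, y, z, V, c=C):
    """Map an event from the stationary system K to the moving system k.

    Assumes the standard full transformation with phi(V) = 1 and a
    relative velocity V along the x-axis.
    """
    gamma = 1.0 / math.sqrt(1.0 - (V / c) ** 2)
    t_prime = gamma * (t - V * x / c ** 2)
    x_prime = gamma * (x - V * t)
    return t_prime, x_prime, y, z


def light_sphere_defect(t, x, y, z, c=C):
    """Return x^2 + y^2 + z^2 - c^2 t^2, which vanishes on a light sphere."""
    return x * x + y * y + z * z - (c * t) ** 2


if __name__ == "__main__":
    # An event reached by a light ray emitted from the origin at t = 0,
    # travelling along the unit direction (0.6, 0, 0.8).
    t = 1.0e-6
    event = (t, 0.6 * C * t, 0.0, 0.8 * C * t)
    moved = einstein_transform(*event, V=0.5 * C)
    print("defect in K:", light_sphere_defect(*event))
    print("defect in k:", light_sphere_defect(*moved))
```

Both defects vanish up to rounding errors: an event lying on the light sphere (83) in the stationary system also lies on the light sphere (84) in the moving system, which is the compatibility statement discussed in the text.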


The second part of Einstein’s paper is devoted to electrodynamics and, more particularly,
to the “transformation of the Maxwell-Hertz equations for empty space”
(Ref. 98, p. 907). The equations actually used by Einstein are those proposed by Hertz
(Ref. 50, p. 138; see Eqs. (12) and (13), p. 35 of this paper) for describing
electromagnetic fields in the ether, with a correction in the minus sign, which he applied
to Hertz’s equation (13), as is done today, and not to Eq. (12). One
can therefore question in which sense the ether is superfluous in Einstein’s theory.
Indeed, Hertz’s equations (12) and (13) that Einstein used were written for etherous
molecules and not for bodies at rest or in motion.

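Applying his coordinate transformation to Hertz’s equations, Einstein then
obtained, without any detail, the corresponding equations in a coordinate system
moving with a velocity V with respect to the resting system, that is (Ref. 98,
p. 907),


$$
\begin{aligned}
\frac{1}{c} \frac{\partial E_x}{\partial t'} &= \gamma \frac{\partial}{\partial y'} \left( H_z - \frac{V}{c} E_y \right) - \gamma \frac{\partial}{\partial z'} \left( H_y + \frac{V}{c} E_z \right), \\
\frac{1}{c}\, \gamma \frac{\partial}{\partial t'} \left( E_y - \frac{V}{c} H_z \right) &= \frac{\partial H_x}{\partial z'} - \gamma \frac{\partial}{\partial x'} \left( H_z - \frac{V}{c} E_y \right), \\
\frac{1}{c}\, \gamma \frac{\partial}{\partial t'} \left( E_z + \frac{V}{c} H_y \right) &= \gamma \frac{\partial}{\partial x'} \left( H_y + \frac{V}{c} E_z \right) - \frac{\partial H_x}{\partial y'},
\end{aligned}
\tag{85}
$$


and (Ref. 98, p. 908)


$$
\begin{aligned}
\frac{1}{c} \frac{\partial H_x}{\partial t'} &= \gamma \frac{\partial}{\partial z'} \left( E_y - \frac{V}{c} H_z \right) - \gamma \frac{\partial}{\partial y'} \left( E_z + \frac{V}{c} H_y \right), \\
\frac{1}{c}\, \gamma \frac{\partial}{\partial t'} \left( H_y + \frac{V}{c} E_z \right) &= \gamma \frac{\partial}{\partial x'} \left( E_z + \frac{V}{c} H_y \right) - \frac{\partial E_x}{\partial z'}, \\
\frac{1}{c}\, \gamma \frac{\partial}{\partial t'} \left( H_z - \frac{V}{c} E_y \right) &= \frac{\partial E_x}{\partial y'} - \gamma \frac{\partial}{\partial x'} \left( E_y - \frac{V}{c} H_z \right).
\end{aligned}
\tag{86}
$$
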

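Einstein then used the principle of relativity, which requires that if “the Maxwell-Hertz equations
for empty space (sic) hold in system “K”, they also hold in system “k”” (Ref. 98,
p. 908). In system “k”, Einstein should get


$$
\frac{1}{c} \frac{\partial E'}{\partial t'} = \nabla' \wedge H' \tag{87}
$$
and
$$
\frac{1}{c} \frac{\partial H'}{\partial t'} = -\nabla' \wedge E'. \tag{88}
$$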


Einstein then identified these last two equations term by term with Eqs. (85) and
(86). In other words, he just wrote what the transformed electric and
magnetic fields should be (similar to those found by Poincaré, Ref. 90, p. 135) for verifying
that his “Maxwell-Hertz equations” obey the principle of relativity. The key
point is the switch from the original Maxwell-Hertz equations to the transformed
Eqs. (85) and (86), and there is no indication of how Einstein got them (the
application of the coordinate transformation to the electric and magnetic fields is
not straightforward at all).


9. Conclusion 


Special relativity clearly came from the description of optical phenomena in moving
bodies. This problem had been investigated for at least 200 years before 1905. Investigating
the various descriptions of these phenomena led us to show that all theories
were dual, combining particles and waves. When a wave theory was required for
describing the experimental observations, the explanation, if one was provided, was always
a mechanical one in terms of light particles interacting with etherous molecules.
It is interesting to note that all optical phenomena are well described
without any mention of the ether but were never explained without it. We also
showed that, contrary to what Einstein claimed, the ether is not removed from his
1905 paper since he used the equations written by Hertz for the electromagnetic
fields in the ether. Many times, the ether blurred the discussion but, if it can be
easily omitted for describing the experimental facts, one still has many difficulties
in explaining how light is propagated (or what a photon is).

As shown by Poincaré, the electromagnetic theory describing Michelson and
Morley’s experiment is the theory proposed by Maxwell (and not the one by
Hertz). By proposing the correct addition law for velocities, Poincaré thus showed
the invariance of Maxwell’s equations when one switches from a resting frame
to a moving frame or, more exactly, between two frames in uniform
translation relative to each other. Various attempts were also published by Voigt,
Larmor, Lorentz and Einstein. Only Lorentz and Poincaré clearly worked in a four-dimensional
formulation by using the vector and scalar potentials for describing
the electromagnetic fields.


Appendices


A.1. Fizeau’s experiments


In order to check Fresnel’s theory, Hippolyte Fizeau (1819-1896) conducted an 
experiment to determine how the ether could be entrained. He retained three main 
hypotheses®®: 


(i) The ether adheres, as if it were fixed to the molecules of the body and, consequently,
shares motions that may be imposed on this body.
(ii) The ether is free and independent, and is not driven by the body in its motion. 
(iii) A portion of the ether would be free, while the other portion would be fixed 
to the molecules of the body and would solely share in its motion. 


The experiment conducted by Fizeau is sketched in Fig. A.1. The fringe shift
resulting from the experiment was $\Delta p = 0.4$. Fizeau then compared this value to
the fringe shift which should result from the different hypotheses on the state of the
ether. Depending on the assumption retained, the expected shift of fringes $\Delta p$ is
more or less large. The first hypothesis ($\Delta p = 0.92$) corresponds to a total driving of
the ether by the motion of the Earth: The light velocity is thus deeply affected. The
second hypothesis ($\Delta p = 0$) is associated with no driving and the light velocity is
constant. Finally, the third hypothesis ($\Delta p = 0.46$) corresponds to a partial driving
and the light velocity is partly affected. Fizeau’s result thus validated Fresnel’s
hypothesis for a partial driving of the ether by the Earth. As a consequence, the
motion of transparent bodies induces a change in the light velocity according to
their refractive property.


Fig. A.1. Sketch of the experimental setup used by Fizeau.


A.2. Michelson and Morley’s experiments


The first experiment to determine the relative motion between the Earth and the
ether using light propagation was suggested by Maxwell in 1878 in an article entitled
Aether for the Encyclopedia Britannica (Ref. 99, p. 270):


If it was possible to determine the light velocity by observing the time 
it takes to travel between one station and another on Earth’s surface, we 
might, by comparing the observed velocities in opposite directions, deter- 
mine the velocity of aether with respect to these terrestrial stations. All 
methods, however, by which it is practicable to determine the velocity of 
light from terrestrial experiments depend on the measurement of the time 
required for the double journey from one station to the other and back 
again, and the increase of this time on account of a relative velocity of the 
ether equal to that of the Earth in its orbit would be only about one hun- 
dred millionth part of the whole time of transmission, and would therefore 
be quite insensible. 


Although Maxwell was not too confident in the success of this experiment, since
the difference depended on the squared ratio of velocities, Albert Michelson
(1852-1931) built a similar experiment (Ref. 100):


If, therefore, an apparatus is so constructed as to permit two pencils of light,
which have traveled over paths at right angles to each other, to interfere,
the pencil which has traveled in the direction of the Earth’s motion, will in
reality travel 4/100 of a wavelength farther than it would have done, were the
Earth at rest. The other pencil being at right angles to the motion would
not be affected. If now, the apparatus be revolved through 90° so that the
second pencil is brought into the direction of the Earth’s motion, its path
will have lengthened 4/100 wave-lengths. The total change in the position of
the interference bands would be 8/100 of the distance between the bands, a
quantity easily measurable.


Fig. A.2. Sketch of the experimental device used by Michelson and Morley in 1887. (For
color version, see page I-CP1.)


But Michelson’s first results did not show any fringe shift and refuted George
Stokes’s hypothesis for a stationary ether (Ref. 101). Considering that his experimental
errors were sufficient to mask the fringe shift, Michelson built a second experiment
with Edward Morley (1838-1923). The same device (Fig. A.2) was placed on a
massive stone floating on a mercury bath in order to remove any vibrational perturbation.
They also increased tenfold the length of the two optical paths (to 11 m
in length) by multiple reflections on the mirrors (Ref. 59).
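
The benefit of lengthening the optical paths can be made concrete with a short estimate. This is a rough sketch under stated assumptions and not the authors’ calculation: it uses the standard textbook expression for the fringe shift expected from a stationary ether when the apparatus is rotated by 90°, namely 2Lv²/(λc²), with the Earth’s orbital speed taken as 30 km/s and a wavelength of 590 nm chosen purely for illustration. The function name expected_fringe_shift is hypothetical.

```python
C = 299_792_458.0        # speed of light in m/s
EARTH_SPEED = 30.0e3     # assumed orbital speed of the Earth in m/s


def expected_fringe_shift(path_length_m, wavelength_m, v=EARTH_SPEED, c=C):
    """Fringe shift expected for a stationary ether after a 90 degree rotation.

    Uses the textbook second-order estimate 2 * L * (v / c)**2 / wavelength,
    where L is the optical path length of each arm.
    """
    return 2.0 * path_length_m * (v / c) ** 2 / wavelength_m


if __name__ == "__main__":
    wavelength = 590.0e-9  # illustrative value, not taken from the text
    short_arm = expected_fringe_shift(1.1, wavelength)
    long_arm = expected_fringe_shift(11.0, wavelength)
    print("shift with 1.1 m paths:", short_arm)
    print("shift with 11 m paths: ", long_arm)
```

Because the estimate is linear in the path length, the tenfold increase to 11 m raises the expected shift tenfold, to roughly four tenths of a fringe under these assumptions, which is why a null result in the second experiment was so significant.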

In spite of the great care spent to build this second experiment, the result was 
the same as for the first one: No fringe shift. Michelson and Morley thus concluded 
(Ref. 59, p. 341): 


It appears, from all that precedes, reasonably certain that, if there be any 
relative motion between the Earth and the luminiferous ether, it must be 
small; quite small enough entirely to refute Fresnel’s explanation of aber- 
ration. Stokes has given a theory of aberration which assumes the ether at 
the Earth’s surface to be at rest with regard to the latter, and only requires 
in addition that the relative velocity have a potential; but Lorentz shows 
that these conditions are incompatible. Lorentz then proposes a modification
which combines some ideas of Stokes and Fresnel, and assumes the
existence of a potential, together with Fresnel’s coefficient. If now it were
legitimate to conclude from the present work that the ether is at rest with
regard to the Earth’s surface, according to Lorentz there could not be a velocity
potential, and his own theory also fails.
In 1889, George FitzGerald (1851-1901), George Stoney’s nephew, suggested
that the sole hypothesis to explain Michelson and Morley’s experiment¹⁰²


[...] is that the length of material body changes, according as they are 
moving through the ether or across it, by an amount depending on the 
square of the ratio of their velocity to that of light. We know that electric 
forces are affected by the motion of the electrified bodies relative to the 
ether, and it seems a not improbable supposition that the molecular forces 
are affected by the motion, and that the size of a body alters consequently. 


Such a contraction of the length was enough to explain the experimental facts. 
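
As a rough numerical illustration of why such a contraction suffices, the sketch below compares the round-trip light times along the two interferometer arms in the stationary-ether picture, with and without shortening the arm parallel to the motion. The contraction factor sqrt(1 - v²/c²), the 11 m arm length and the orbital speed of 30 km/s are assumptions chosen for illustration; FitzGerald's note only states that the change depends on the square of the ratio v/c.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def round_trip_times(arm_length_m, speed_m_s, contracted=False):
    """Round-trip light times (parallel arm, perpendicular arm) in the ether picture.

    If `contracted` is True, the arm parallel to the motion is shortened by
    sqrt(1 - v^2/c^2), which is the assumed form of the contraction.
    """
    beta2 = (speed_m_s / C) ** 2
    parallel_length = arm_length_m
    if contracted:
        parallel_length = arm_length_m * math.sqrt(1.0 - beta2)
    t_parallel = 2.0 * parallel_length / (C * (1.0 - beta2))
    t_perpendicular = 2.0 * arm_length_m / (C * math.sqrt(1.0 - beta2))
    return t_parallel, t_perpendicular


V_ORBIT = 3.0e4  # assumed orbital speed of the Earth in m/s

t_par, t_perp = round_trip_times(11.0, V_ORBIT)
t_par_c, t_perp_c = round_trip_times(11.0, V_ORBIT, contracted=True)

print("time difference without contraction (s):", t_par - t_perp)
print("time difference with contraction (s):   ", t_par_c - t_perp_c)
```

Without contraction the two times differ at order (L/c)(v/c)², which is what the interferometer was designed to detect; with the assumed contraction the difference vanishes to rounding error.
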
This short exposition starts with a brief discussion of the situation before the completion
of special relativity (Le Verrier’s discovery of the Mercury perihelion advance anomaly,
the Michelson-Morley experiment, the Eötvös experiment, Newcomb’s improved observation of
the Mercury perihelion advance, the proposals of various new gravity theories and the
development of tensor analysis and differential geometry) and accounts for the main
conceptual developments leading to the completion of general relativity (CGR): gravity has
a finite velocity of propagation; energy also gravitates; Einstein proposed his equivalence
principle and deduced the gravitational redshift; Minkowski formulated special
relativity in four-dimensional spacetime and derived the four-dimensional electromagnetic
stress-energy tensor; Einstein derived the gravitational deflection from his equivalence
principle; Laue extended Minkowski’s method of constructing the electromagnetic
stress-energy tensor to stressed bodies, dust and relativistic fluids; Abraham, Einstein, and
Nordström proposed their versions of scalar theories of gravity in 1911-13; Einstein and
Grossmann first used the metric as the basic gravitational entity and proposed a “tensor”
theory of gravity (the “Entwurf” theory, 1913); Einstein proposed a theory of gravity
with the Ricci tensor proportional to the stress-energy tensor (1915); Einstein, based on the 1913
Besso-Einstein collaboration, correctly derived the relativistic perihelion advance
formula of his new theory, which agreed with observation (1915); Hilbert discovered the
Lagrangian for the electromagnetic stress-energy tensor and the Lagrangian for the
gravitational field (1915), and stated the Hilbert variational principle; the Einstein equation of
GR was proposed (1915); Einstein published his foundation paper (1916). Subsequent
developments and applications in the next two years included the Schwarzschild solution
(1916), gravitational waves and the quadrupole formula of gravitational radiation (1916,
1918), cosmology and the proposal of the cosmological constant (1917), the de Sitter solution
(1917) and the Lense-Thirring effect (1918).
1. Prelude - Before 1905


General Relativity (GR) was accepted quickly by the world community. This was
not the case for Newtonian gravitation.¹ We quote from the beginning of Chapter V
on Gravitation of Vol. II of Whittaker²: “We have seen (cf. Vol. I, pp. 29-31) that
for many years after its first publication, the Newtonian doctrine of gravitation was 
not well received. Even in Newton’s own University of Cambridge, the textbook of 
physics in general use during the first quarter of the 18th century was still Cartesian: 
while all the great mathematicians of the Continent — Huygens in Holland, Leibnitz 
in Germany, Johann Bernoulli in Switzerland, Cassini in France — rejected the 
Newtonian theory altogether”. 

“This must not be set down entirely to prejudice: many well-informed 
astronomers believed, apparently with good reason, that the Newtonian law was 
not reconcilable with the observed motions of the heavenly bodies. They admitted
that it explained satisfactorily the first approximation to the planetary orbit,
namely that they are ellipses with the sun in one focus: but by the end of the seventeenth
century much was known observationally about the departures from elliptic motion,
or inequalities as they are called, which were presumably due to mutual gravita- 
tional interaction: and some of these seemed to resist every attempt to explain them 
as consequences of the Newtonian law”. 

The most serious one was the Great inequality of Jupiter and Saturn. In the 
same page, Whittaker continued: “A comparison of the ancient observations cited 
by Ptolemy in the Almagest with those of the earlier astronomers of Western Europe 
and their more recent successors, showed that for centuries past the mean motion, or 
average angular velocity round the sun, of Jupiter, had been continually increasing, 
while the mean motion of Saturn had been continually decreasing”. According to 
Kepler’s³ third law, the orbit of Jupiter must be shrinking and the orbit of Saturn
must be expanding. This stimulated the development of celestial mechanics. Euler
and Lagrange made significant advances. In 1784, Laplace found that the Great
inequality is not a secular inequality but a periodic inequality with a long period of
929 years, due to the nearly commensurable orbital periods of Jupiter and Saturn.
Calculation agreed with observations, and the issue was completely resolved. For a more
thorough study of the history of the Great inequality of Jupiter and Saturn, see the
doctoral thesis of Curtis Wilson.⁴

In 1781, Herschel discovered the planet Uranus. Over the years, Uranus persistently
wandered away from its expected Newtonian path. In 1834, Hussey suggested that
the deviation was due to the perturbation of an undiscovered planet. In 1846, Le Verrier
predicted the position of this new planet. On 23 September 1846, Galle and d’Arrest
found the new planet, Neptune, within one degree of arc of Le Verrier’s calculation.
This symbolized the great achievement of Newton’s theory.⁵

With the discovery of Neptune, Newton’s theory of gravitation was at its peak.
As the orbit determination of Mercury reached a precision of 10⁻⁸, the relativistic
effect of gravity showed up. In 1859, Le Verrier discovered the anomalous perihelion
advance of Mercury.⁶
Anomalous perihelion advance of Mercury. In 1840, Arago suggested to Le Verrier
that he work on the subject of Mercury’s motion. Le Verrier published a provisional
theory in 1843. It was tested at the 1848 transit of Mercury and there was not close
agreement. As to the cause, Le Verrier⁷ wrote “Unfortunately, the consequences of
the principle of gravitation have not been deduced in many particulars with a sufficient
rigor: we will not be able to decide, when faced with a disagreement between
observation and theory, whether this results completely from analytical errors or
whether it is due in part to the imperfection of our knowledge of celestial physics”.⁷,⁸

In 1859, Le Verrier⁶ published a more sophisticated theory of Mercury’s motion.
This theory was sufficiently rigorous for any disagreement with observation to be
taken quite confidently as indicating a new scientific fact. In this paper, he used two
sets of observations: a series of 397 meridian observations of Mercury taken at the
Paris Observatory between 1801 and 1842, and a set of observations of 14 transits
of Mercury. The transit data are more precise, with an uncertainty of the order of
1″. The calculated planetary perturbations of Mercury are listed in Table 1.⁶,⁸ In
addition to these perturbations, there is a 5025″/century general precession in the
observational data due to the precession of the equinoxes. The fit of the observational data
to the theoretical calculations shows discrepancies. These discrepancies turned out to
be due to relativistic-gravity effects. Le Verrier attributed these discrepancies to an
additional 38″ per century anomalous advance in the perihelion of Mercury.⁷

Table 1. Planetary perturbations
of the perihelion of Mercury.⁶,⁸

Planet     Perturbation
Venus      280″.6/century
Earth       83″.6/century
Mars         2″.6/century
Jupiter    152″.6/century
Saturn       7″.2/century
Uranus       0″.1/century

Total      526″.7/century

Newcomb⁹ in 1882, with improved calculations and data set, obtained 42″.95
per century for the anomalous perihelion advance of Mercury. A more recent value
(1990) was (42″.98 ± 0.04)/century.¹⁰ At present, ephemeris fitting has reached 10⁻⁴
precision. See Ref. 11 and references therein.

Michelson-Morley experiment. According to Newton’s second law of motion and
the Galilean transformation, the velocity of light would change in a moving frame. However,
this is not the experimental finding of Michelson and Morley in 1887¹²: “Considering
the motion of the earth in its orbit only, this displacement should be 2Dv²/V² =
2D × 10⁻⁸. The distance D was about eleven meters, or 2 × 10⁷ wavelengths of
yellow light; hence the displacement to be expected was 0.4 fringe. The actual
displacement was certainly less than the twentieth part of this kind, and probably
less than the 40th part. But since the displacement is proportional to the square of
the velocity, the relative velocity of the earth and the ether is probably less than one
sixth the earth’s orbital velocity, and certainly less than one-fourth”. D is the optical
path length in one arm of the multi-reflection Michelson-Morley interferometer,
which sat on a granite slab floating in liquid mercury; v is the velocity of the earth relative
to the ether; V is the velocity of light. In modern Michelson-Morley experiments, one
measures the frequency changes Δν/ν of two perpendicular Fabry-Perot cavities.
The most precise experiment, by Nagel et al.,¹³ measured the changes Δν/ν of two
cryogenic cavities to be (9.2 ± 10.7) × 10⁻¹⁹ (95% confidence interval), an improvement
of nine orders of magnitude over the original Michelson-Morley experiment.
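
The arithmetic in the Michelson-Morley quotation can be reproduced in a few lines. The sketch below evaluates the expected displacement 2Dv²/V² with D expressed in wavelengths, and then converts the quoted upper limits on the observed displacement into upper limits on the ether velocity, using only the fact that the displacement scales as v². The function names are illustrative and not taken from the original paper.

```python
import math


def expected_fringe_shift(path_in_wavelengths, v_over_V):
    """Expected displacement 2 D v^2 / V^2, with D given in wavelengths."""
    return 2.0 * path_in_wavelengths * v_over_V ** 2


def velocity_bound(fraction_of_expected_shift):
    """Upper bound on v (in units of the orbital velocity).

    The displacement is proportional to v^2, so an observed displacement below
    a given fraction of the expected one bounds v by the square root of it.
    """
    return math.sqrt(fraction_of_expected_shift)


# D = 2 x 10^7 wavelengths and v/V = 10^-4, as in the quotation.
shift = expected_fringe_shift(2.0e7, 1.0e-4)
print("expected displacement (fringes):", shift)

# "certainly less than the twentieth part" and "probably less than the 40th part"
print("v / v_orbit certainly below:", velocity_bound(1.0 / 20.0))
print("v / v_orbit probably below: ", velocity_bound(1.0 / 40.0))
```

The two bounds come out at about 0.22 and 0.16 of the orbital velocity, consistent with the "one-fourth" and "one sixth" stated in the quotation.
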

Eötvös experiment. In 1889, Eötvös¹⁴ used a torsion balance with different types
of sample materials to significantly improve the test of the Galileo equivalence
principle (the equivalence of gravitational mass and inertial mass; the universality
of free fall)¹⁵ to a precision of 1 in 20 million (5 × 10⁻⁸). The most recent terrestrial
experiments of the Washington group used a torsion balance to compare the differential
accelerations of beryllium-aluminum and beryllium-titanium test-body pairs with
precisions at the part in 10¹³ level and confirmed the Galileo equivalence principle.¹⁶
The first space experiment, Microscope (MICRO-Satellite à traînée Compensée pour
l’Observation du Principe d’Equivalence),¹⁷,¹⁸ has been in orbit since 26 April 2016
with the aim of improving the test accuracy to the one part in 10¹⁵ level and is
performing functional tests successfully.¹⁸ The Microscope test masses are made of alloys
of platinum-rhodium (PtRh10: 90% Pt, 10% Rh) and titanium-aluminum-vanadium
(TA6V: 90% Ti, 6% Al, 4% V), while the REF test masses are made
of the same PtRh10 alloy. The weak equivalence principle for photons is confirmed with
precisions at the part in 10³⁸ level in astrophysical and cosmological observations
on electromagnetic wave propagation.¹⁹

The discovery of the Mercury perihelion advance anomaly undermined Newton’s
gravitation theory, while the null results of Michelson and Morley undermined
Galilean invariance and Newton’s dynamics. The foundation of Newton’s world system
and classical physics needed to be replaced. The precise verification of the weak
equivalence principle, and the realization that phenomena are the same in a uniformly
moving boat and on the ground, made it easier to take one further conceptual step:
to comprehend and formulate the Einstein Equivalence Principle (EEP) (the phenomena
in a falling elevator are the same as in free space).

In the last half of the 19th century, efforts to account for the anomalous perihelion
advance of Mercury explored two general directions: (i) searching for a putative
planet ‘Vulcan’ or other matter inside Mercury’s orbit; and (ii) postulating an
ad hoc modified gravitational force law. Both these directions proved unsuccessful.
Proposed modifications of the gravitational law included Clairaut’s force law (of
the form A/r² + B/r⁴), Hall’s hypothesis (that the gravitational attraction is
proportional to the inverse of the distance to the (2 + δ) power instead of the square), and
velocity-dependent force laws. The reader is referred to Ref. 8 for a thorough study
of the history related to Mercury’s perihelion advance.

A compelling solution to this problem had to await the development of GR.
When GR is taken as the correct theory for predicting corrections to Newton’s
theory, we understand why a discrepancy would be seen once the observations reached
an accuracy of the order of 1″ per century (transit observations). Over a century,
Mercury orbits the Sun about 400 times, amounting to a total angle of 5 × 10⁸
arcsec. The fractional relativistic correction (perihelion advance anomaly) of Mercury’s
orbit is of order ξ G_N M_Sun/(d c²), with d being the distance of Mercury from the
Sun and ξ a parameter of order one depending on the theory; for GR, with ξ = 3, it
is 8 × 10⁻⁸. Therefore, the general relativistic correction for the perihelion advance is
about 40 arcsec per century. As the orbit determination of Mercury reached an
accuracy of order 10⁻⁸, the relativistic corrections to Newtonian gravity became
manifest.
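
The order-of-magnitude estimate above can be checked numerically. The sketch below evaluates the fractional correction 3 G_N M_Sun/(d c²), the total angle swept in a century, and, as a refinement not spelled out in the text, the standard GR result 6π G_N M_Sun/[a(1 - e²)c²] per orbit. The numerical constants (solar GM, Mercury's semi-major axis, eccentricity and period) are assumed round reference values, not figures taken from this chapter.

```python
import math

GM_SUN = 1.32712440018e20   # assumed G_N * M_Sun in m^3/s^2
C = 299_792_458.0           # speed of light in m/s
A_MERCURY = 5.7909e10       # assumed semi-major axis of Mercury in m
E_MERCURY = 0.2056          # assumed orbital eccentricity of Mercury
PERIOD_DAYS = 87.969        # assumed orbital period of Mercury in days
ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi


def fractional_correction(xi=3.0, distance_m=A_MERCURY):
    """Order-of-magnitude fractional correction xi * G M / (d c^2)."""
    return xi * GM_SUN / (distance_m * C ** 2)


def orbits_per_century():
    return 36525.0 / PERIOD_DAYS


def gr_perihelion_advance_arcsec_per_century():
    """Standard GR advance, 6 pi G M / (a (1 - e^2) c^2) per orbit."""
    per_orbit_rad = 6.0 * math.pi * GM_SUN / (A_MERCURY * (1.0 - E_MERCURY ** 2) * C ** 2)
    return per_orbit_rad * orbits_per_century() * ARCSEC_PER_RADIAN


total_angle_arcsec = orbits_per_century() * 360.0 * 3600.0
print("fractional correction:", fractional_correction())
print("angle swept per century (arcsec):", total_angle_arcsec)
print("rough estimate (arcsec/century):", fractional_correction() * total_angle_arcsec)
print("GR formula (arcsec/century):", gr_perihelion_advance_arcsec_per_century())
```

The rough product lands near 40 arcsec per century, and the full formula gives a value close to 43 arcsec per century, in line with the observed anomaly quoted earlier.
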

We thus see how gravitational anomalies can lead either to the discovery of 
missing matter or to a modification of the fundamental theory for gravity. 

It is not entirely coincidental that Le Verrier not only predicted the position
of a new planet, but also discovered the Mercury perihelion advance anomaly, as
astronomical observations had been refined and accumulated for a century.

The Michelson-Morley experiment inspired the consideration of a new covariant
formulation of electromagnetism under reference frame transformations. The
Michelson-Morley experiment, with various proposals and developments, led eventually to the
approximate transformation theory of Lorentz²⁰ and the principle of relativity of
Poincaré.²¹⁻²³ In 1901, Poincaré²⁴ performed a rigorous mathematical and physical
analysis of various variants of the electrodynamic theory; in the introduction, he
wrote (English translation from pp. 48-49 of Ref. 25): “Although none of these theories
seems to me fully satisfactory, each one contains without any doubt a part of
the truth and comparing them may be instructive. From all of them, Lorentz theory
seems to me the one which describes in the better way the facts”. What Poincaré
used as a criterion of satisfaction is whether the principle of relative motion is fully
satisfied (Ref. 21, p. 477): “The movement of any system whatever ought to obey
the same laws, whether it is referred to fixed axes or to the movable axes which
are implied in uniform motion in a straight line” (English translation from p. 63
of Ref. 25). This is clearly the invariance of the laws under a change of reference
frame, when one frame is related to the other by a constant velocity.²⁵ Nevertheless, in
1900, the transformation from one frame to the other was not known. The “classical”
composition law for velocities clearly failed to explain optical
experiments such as Michelson and Morley’s.²⁵

In 1902, Poincaré called the principle of relative motion the principle of
relativity.²²,²³ In 1904, Poincaré gave a talk entitled L’état actuel et l’avenir de la
physique mathématique to the scientific congress at the Saint Louis World Fair and
stated the Principle of Relativity²²,²⁶ as “The laws of physical phenomena must be
the same for a fixed observer and for an observer in rectilinear and uniform motion
so that we have no possibility of perceiving whether or not we are dragged in such
a motion”. In the same year, Lorentz²⁰ formulated an approximate transformation
theory which satisfied the principle of relativity and agreed with all the experiments
to their precision at that time.
In 1905, using the principle of relativity, Poincaré²⁷ arrived at the exact invariant
transformation (Poincaré called it the Lorentz transformation) and completed
the transformation theory of special relativity. In a subsequent paper, Einstein²⁸
also arrived at the exact Lorentz transformation and completed the transformation
theory of special relativity. Thus, the special theory of relativity was born.
For a more complete study of the history of the development of the special theory of
relativity, we refer the readers to Messager and Letellier’s review.²⁵

In addition to the transformation theory of the special theory of relativity, Einstein²⁹
made a conceptual advance in postulating and ascertaining the general mass-energy
equivalence relation: E = mc². For a brief history of the genesis of the mass-energy
equivalence relation, we quote Whittaker (pp. 51-52 of Ref. 2):

“We have now to trace the gradual emergence of one of the greatest discoveries 
of the twentieth century, namely, the connection of mass and energy”. 

“As we have seen,¹ Thomson in 1881 arrived at the result that a charged spherical
conductor moving in a straight line behaves as if it had an additional mass of
amount (4/3c²) times the energy of its electrostatic field.² In 1900 Poincaré,³ referring
to the fact that in free aether the electromagnetic momentum is (1/c²) times
the Poynting flux of energy, suggested that electromagnetic energy might possess
mass density equal to (1/c²) times the energy density: that is to say, E = mc²
where E is energy and m is mass: and he remarked that if this were so, then
a Hertz oscillator, which sends out electromagnetic energy preponderantly in one
direction, should recoil as a gun does when it is fired. In 1904, Hasenöhrl⁴ (1874-1915)
considered a hollow box with perfectly reflecting walls filled with radiation,
and found that when it is in motion there is an apparent
addition to its mass, of amount (8/3c²) times the energy possessed by the radiation
when the box is at rest: in the following year¹ he corrected this to (4/3c²) times
the energy possessed by the radiation when the box is at rest²; that is, he agreed
with Thomson’s E = (3/4)mc² rather than with Poincaré’s E = mc². In 1905,
Einstein³ asserted that when a body is losing energy in the form of radiation its
mass is diminished approximately (i.e. neglecting quantities of the fourth-order) by
(1/c²) times the energy lost. He remarked that it is not essential that the energy
loss by the body should consist of radiation, and suggested the general conclusion,
in agreement with Poincaré, that the mass of a body is a measure of its energy
content: if the energy changes by E ergs, the mass changes in the same sense by
E/c² grams. In the following year he claimed⁴ that this law is the necessary and
sufficient condition that the law of conservation of motion of the center of gravity
should be valid for systems in which electromagnetic as well as mechanical
processes are taking place”. (We refer the readers to Ref. 2 for the footnotes and
references in the quotation, noting only that Einstein’s two references are Refs. 29
and 30 and that further studies by Fermi (1922), Wilson (1936), von Mosengeil
(1907) and Planck (1907) corrected both cases with E = (3/4)mc² to agree with
E = mc².)
Table 2. Historical steps toward synthesis of a new theory of gravity (post Newtonian
theory) before the genesis of special relativity in 1905.

Year        Reference                       Historical step
1859        Le Verrier⁶                     Discovery of Mercury perihelion advance anomaly
1882        Newcomb⁹                        Improved measurement of the Mercury perihelion
                                            advance anomaly
1887        Michelson and Morley¹²          Michelson-Morley experiment
1889        Eötvös¹⁴                        Eötvös experiment to test WEP to 10⁻⁸ level
1864 on     See, e.g. Roseveare⁸            The proposals of various new gravity theories
1854-1900   Riemann,³³ Klein,³⁵             The development of differential geometry and tensor
            Ricci and Levi-Civita⁴⁰         analysis
1887-1904   Lorentz,²⁰ and various          Approximate transformation theory of special
            authors                         relativity
1900-1904   Poincaré²¹⁻²³                   Principle of relativity
1905        Poincaré,²⁷ Einstein²⁸          Exact transformation theory of special relativity
1905        Einstein²⁹                      E = mc² in special relativity


Further developments in special relativity. The development of special relativity
continued after 1905. Planck in 1906³¹ obtained the relativistic formulas for the kinetic
energy and momentum of a material particle. Minkowski in 1907 derived the
four-dimensional covariant formulation of Maxwell’s equations together with the
four-dimensional stress-energy tensor of the electromagnetic field.³² We will address
more of these developments where they are relevant to the genesis of GR.

Differential geometry and tensor calculus. In 1854, Riemann³³ founded Riemannian
geometry. The metric was the fundamental entity in Riemannian geometry.
Christoffel³⁴ introduced covariant differentiation. In the 1872 Erlangen program,
Klein³⁵ first gave a generalized definition of geometry and clearly indicated the
essential nature of a vector under the group of rotations of orthogonal axes in
three-dimensional space. Various authors³⁶⁻³⁹ drew attention to symmetric tensors
of rank 2, scalars and tensors of rank 2. From 1887 onwards, Ricci-Curbastro
generalized the theory to tensor calculus for transformations in curved space of
any dimension. It became widely known when Ricci (Ricci-Curbastro) and Levi-Civita⁴⁰
published their memoir describing it in 1900. These developments greatly
facilitated the development of GR.

Table 2 lists important historical steps toward synthesis of a new theory of
gravity (post Newtonian theory) agreeing with experiment/observation before the
genesis of special relativity in 1905.


2. The Period of Searching for Directions and New 
Ingredients: 1905-1910 


The genesis of GR can be roughly divided into three periods: (i) 1905-1910, the period
of searching for directions and ingredients; (ii) 1911-1914, the period of various
trial theories; (iii) 1915-1916, the synthesis and consolidation. In the prelude we
have seen that Newton’s gravitation theory needed to be replaced. In this section,
we first discuss some of its ingredients and then the search for directions and new
ingredients toward the genesis of a new gravitation theory.

Ingredients of Newton’s theory. Newton’s theory of gravity is an inverse-square law
with active gravitational mass proportional to passive gravitational mass and active
gravitational mass also proportional to inertial mass. With an appropriate choice of
units, the gravitational force $\mathbf{F}_{1\to 2}$ acting on body 2 from body 1 can be written in
the form

$$\mathbf{F}_{1\to 2} = G_N\, m_{A1}\, m_{P2}\, \frac{\mathbf{n}_{12}}{r_{12}^{2}} = m_{\mathrm{iner}2}\, \mathbf{a}_2, \qquad (1)$$

where $m_{A1}$ is the active gravitational mass of body 1, $m_{P2}$ the passive gravitational
mass of body 2, $\mathbf{n}_{12}$ the unit vector from body 2 to body 1, $r_{12}$ the distance
between body 1 and body 2, $m_{\mathrm{iner}2}$ the inertial mass of body 2, $\mathbf{a}_2$ the acceleration
of body 2 and $G_N$ the universal Newton constant. The Galileo weak equivalence
principle dictates the equality of the passive gravitational mass and the inertial mass,
i.e. $m_P = m_{\mathrm{iner}} = m$, while Newton’s third law of motion dictates the equality of
the passive gravitational mass and the active gravitational mass, i.e. $m_P = m_A = m$.
Hence, (1) becomes

$$\mathbf{F}_{1\to 2} = G_N\, m_1\, m_2\, \frac{\mathbf{n}_{12}}{r_{12}^{2}} = m_2\, \mathbf{a}_2. \qquad (2)$$

The action is instantaneous. In Newton’s original form the theory is an action-at-a-distance
theory. In potential theory form, the gravitational potential $\Phi(\mathbf{x}, t)$ for a mass
distribution $\rho(\mathbf{x}, t)$ satisfies the Poisson equation:

$$\nabla^{2}\Phi(\mathbf{x}, t) = 4\pi G_N\, \rho(\mathbf{x}, t). \qquad (3)$$

The left-hand side of (3) depends on the gravitational field while the right-hand
side depends on the gravitating source. In the field approach, to reach a new theory
of gravity we may need to replace both the left-hand side and the right-hand side.
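
The role of the three kinds of mass in Eq. (1) can be made concrete with a small numerical sketch. The function below evaluates the force on body 2 from body 1 and the resulting acceleration, keeping active, passive and inertial masses as separate arguments. The value of G_N and the sample masses and positions are illustrative assumptions, and the function names are hypothetical.

```python
import math

G_N = 6.674e-11  # assumed value of Newton's constant in SI units


def force_on_2_from_1(m_active_1, m_passive_2, pos1, pos2):
    """Force of Eq. (1): G_N * m_A1 * m_P2 * n_12 / r_12^2, as a 3-vector."""
    separation = [a - b for a, b in zip(pos1, pos2)]  # points from body 2 to body 1
    r12 = math.sqrt(sum(c * c for c in separation))
    n12 = [c / r12 for c in separation]
    magnitude = G_N * m_active_1 * m_passive_2 / r12 ** 2
    return [magnitude * c for c in n12]


def acceleration_of_2(m_active_1, m_passive_2, m_inertial_2, pos1, pos2):
    """Acceleration a_2 = F_{1->2} / m_iner2."""
    force = force_on_2_from_1(m_active_1, m_passive_2, pos1, pos2)
    return [f / m_inertial_2 for f in force]


pos1, pos2 = [0.0, 0.0, 0.0], [3.0, 4.0, 0.0]
m1 = 5.0e10

# Weak equivalence principle: with m_P = m_iner the acceleration of body 2
# does not depend on its mass.
a_light = acceleration_of_2(m1, 1.0, 1.0, pos1, pos2)
a_heavy = acceleration_of_2(m1, 1000.0, 1000.0, pos1, pos2)
print(a_light, a_heavy)

# Third law: with m_A = m_P the forces on the two bodies are equal and opposite.
f_on_2 = force_on_2_from_1(m1, 1000.0, pos1, pos2)
f_on_1 = force_on_2_from_1(1000.0, m1, pos2, pos1)
print(f_on_2, f_on_1)
```

Setting the passive mass different from the inertial mass in `acceleration_of_2` makes the acceleration composition-dependent, which is exactly what the Eötvös-type experiments constrain.
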

Finite velocity of propagation. It is natural that Poincaré, who had reached the exact
transformation theory in agreement with the principle of relativity, also thought
about how to reconcile gravity with it. Poincaré²⁷,⁴¹ pointed out that for the principle of
relativity to be true, gravity must propagate with the speed of light, and mentioned
gravitational waves propagating with the speed of light based on Lorentz invariance.
He attempted to formulate an action-at-a-distance theory of gravity with finite
propagation velocity compatible with the principle of relativity, but was unsuccessful.

All energy must gravitate. As we mentioned in the last section, Planck³¹ obtained
the relativistic formulas for the kinetic energy and momentum of a material particle in
1906. Since energy is equivalent to mass and has inertia, it must gravitate according
to the equivalence of the inertial mass and the gravitational mass, which was verified
to great precision by the Eötvös experiment. Hence, Planck⁴² postulated in 1907 that all
energy must gravitate and made another step toward a new theory of gravity.

EEP. Einstein,⁴³ in the last part (Principle of Relativity and Gravitation) of
his comprehensive 1907 essay on relativity, proposed the complete physical equivalence
of a homogeneous gravitational field to a uniformly accelerated reference
system: “We consider two systems of motion, Σ₁ and Σ₂. Suppose Σ₁ is accelerated
in the direction of its X-axis, and γ is the magnitude (constant in time) of this
acceleration. Suppose Σ₂ is at rest, but situated in a homogeneous gravitational
field, which imparts to all objects an acceleration −γ in the direction of the X-axis.
As far as we know, the physical laws with respect to Σ₁ do not differ from those
with respect to Σ₂, this derives from the fact that all bodies are accelerated alike in
the gravitational field. We have therefore no reason to suppose in the present state
of our experience that the systems Σ₁ and Σ₂ differ in any way, and will therefore
assume in what follows the complete physical equivalence of the gravitational field
and the corresponding acceleration of the reference system”.

From this equivalence, Einstein derived clock and energy redshifts in a gravitational
field. The reasoning is clear and simple: two observers at different locations
in the uniform gravitational field can be equivalently considered to be in an accelerated
frame. In the equivalent accelerated frame there is a Doppler shift. This gives the
redshift/blueshift in the gravitational field. When applied to a spacetime region where
inhomogeneities of the gravitational field can be neglected, this equivalence dictates
the behavior of matter in the gravitational field. The postulate of this equivalence
is called the EEP. EEP is the cornerstone of the gravitational coupling of matter
and nongravitational fields in GR and in metric theories of gravity. EEP fixes local
physics to be special relativistic.
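
The Doppler argument can be turned into a one-line estimate. In the accelerated frame, light emitted at one observer takes a time h/c to reach the other, during which the receiver gains a velocity γh/c, so the first-order Doppler shift is γh/c². The sketch below evaluates this for a laboratory-scale height; the values g = 9.81 m/s² and h = 22.5 m are illustrative assumptions, not numbers from this chapter.

```python
C = 299_792_458.0  # speed of light in m/s


def fractional_frequency_shift(acceleration, height):
    """First-order shift from the equivalence principle: gamma * h / c^2.

    Light needs a time h/c to cross the height h; in that time the receiver in
    the accelerated frame changes its velocity by acceleration * h / c, and the
    first-order Doppler shift is that velocity divided by c.
    """
    travel_time = height / C
    velocity_gain = acceleration * travel_time
    return velocity_gain / C


shift = fractional_frequency_shift(9.81, 22.5)
print("fractional frequency shift:", shift)
```

For these assumed values the shift is a few parts in 10¹⁵, which shows why the effect was out of reach of the laboratory techniques of 1907.
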

Local physics in Newtonian gravity also obeys this equivalence principle formally,
except that there the local physics is Newtonian mechanics, not special relativity.
(Here the transformation to the accelerated frame is through a non-Galilean
transformation. See, e.g. Ref. 19 and references therein for details.)

Four-dimensional spacetime formulation and the Minkowski metric. On 21
December 1907, Minkowski read before the Academy “Die Grundgleichungen für die
elektromagnetischen Vorgänge in bewegten Körpern” (The fundamental equations
for electromagnetic processes in moving bodies)³² (see also Ref. 44). In this paper,
Minkowski put the Maxwell equations into geometric form in four-dimensional spacetime
with Lorentz covariance, using Cartesian coordinates x, y, z and imaginary time
it and numbering them as x₁ = x, x₂ = y, x₃ = z and x₄ = it. Minkowski defined
the four-dimensional excitation in terms of D and H, and the four-dimensional field
strength in terms of E and B.

The Maxwell equations in Minkowski form were soon written in integral form by
Hargreaves⁴⁵ and investigated in detail by Bateman⁴⁶ and Kottler.⁴⁷

In 1909, Bateman⁴⁶ worked on the electrodynamic equations. He used the time coordinate
t instead of x₄, and studied integral equations and the invariant transformation
groups. He considered specifically transformations that leave invariant
the differential (form) equation

$$(dx)^{2} + (dy)^{2} + (dz)^{2} - (dt)^{2} = 0 \qquad (4)$$

and included conformal transformations in addition to Lorentz transformations;
he thereby went one step further toward general coordinate invariance. He did
use a more general (indefinite) metric arising from coordinate transformations in his study
of the electromagnetic equations.
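With the definition $x^1 = x$, $x^2 = y$, $x^3 = z$ and $x^0 = t$, Eq. (4) can be written as

$$(dx^{1})^{2} + (dx^{2})^{2} + (dx^{3})^{2} - (dx^{0})^{2} = -\eta_{ij}\, dx^{i} dx^{j} = 0, \qquad (5)$$

where the Minkowski metric $\eta_{kl}$ is defined as

$$\eta_{kl} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \qquad (6a)$$

with its inverse $\eta^{kl}$

$$\eta^{kl} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \qquad (6b)$$

In (5) and throughout this paper, we use the Einstein convention of summing over repeated indices.
The Minkowski metric is used in raising and lowering covariant and contravariant indices
in special relativity.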
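
As a small illustration of Eqs. (5) and (6a), the sketch below implements the quadratic form η_ij dx^i dx^j with the index order (t, x, y, z) and units with c = 1, checks that it vanishes for a light ray, and verifies that a Lorentz boost along x leaves it unchanged. The helper names are hypothetical and only the boost along one axis is shown.

```python
import math

# Minkowski metric of Eq. (6a), index order (0, 1, 2, 3) = (t, x, y, z), c = 1.
ETA = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, -1.0, 0.0],
    [0.0, 0.0, 0.0, -1.0],
]


def interval(dx):
    """Quadratic form eta_ij dx^i dx^j (summation over repeated indices)."""
    return sum(ETA[i][j] * dx[i] * dx[j] for i in range(4) for j in range(4))


def lower_index(v):
    """Lower a contravariant index: v_i = eta_ij v^j."""
    return [sum(ETA[i][j] * v[j] for j in range(4)) for i in range(4)]


def boost_x(dx, beta):
    """Lorentz boost with velocity beta (in units of c) along the x axis."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    t, x, y, z = dx
    return [gamma * (t - beta * x), gamma * (x - beta * t), y, z]


light_ray = [1.0, 1.0, 0.0, 0.0]   # dt = dx: satisfies Eq. (4)
timelike = [2.0, 1.0, 0.5, 0.0]

print(interval(light_ray), interval(boost_x(light_ray, 0.6)))
print(interval(timelike), interval(boost_x(timelike, 0.6)))
print(lower_index(timelike))
```

Lowering an index flips the sign of the spatial components only, which is why covariant and contravariant components must be distinguished once the metric is indefinite.
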

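With an indefinite metric, one has to distinguish covariant and contravariant tensors
and indices. Aware of this, one can readily put the Maxwell equations into covariant
form without using imaginary time. Following Minkowski³² but using a real time
coordinate, in terms of the Minkowski four-dimensional field strength $F_{kl}$ (E, B) and
four-dimensional excitation (density) $H^{ij}$ (D, H),

$$F_{kl} = \begin{pmatrix} 0 & E_1 & E_2 & E_3 \\ -E_1 & 0 & -B_3 & B_2 \\ -E_2 & B_3 & 0 & -B_1 \\ -E_3 & -B_2 & B_1 & 0 \end{pmatrix}, \qquad (7a)$$

$$H^{ij} = \begin{pmatrix} 0 & -D_1 & -D_2 & -D_3 \\ D_1 & 0 & -H_3 & H_2 \\ D_2 & H_3 & 0 & -H_1 \\ D_3 & -H_2 & H_1 & 0 \end{pmatrix}, \qquad (7b)$$

the Maxwell equations can be expressed in Minkowski form as

$$H^{ij}{}_{,j} = -4\pi J^{i}, \qquad (8a)$$

$$e^{ijkl} F_{jk,l} = 0, \qquad (8b)$$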
where $J^{i}$ is the charge 4-current density (ρ, J) and $e^{ijkl}$ the completely
antisymmetric tensor density (Levi-Civita symbol) with $e^{0123} = 1$. “,” means partial
differentiation. In vacuum, the relation between the Minkowski four-dimensional field strength
$F_{kl}$ (E, B) and the four-dimensional excitation (density) $H^{ij}$ (D, H) is

$$H^{ij} = \eta^{ik}\eta^{jl} F_{kl} = \tfrac{1}{2}\left(\eta^{ik}\eta^{jl} - \eta^{il}\eta^{jk}\right) F_{kl}, \quad \text{i.e. } \mathbf{D} = \mathbf{E},\ \mathbf{H} = \mathbf{B}. \qquad (9)$$

Four-dimensional electromagnetic stress-momentum—energy tensor. In the same 
paper, Minkowski®? derived the four-dimensional electromagnetic stress-momen- 
tum-energy (or stress-energy or energy-momentum) tensor T(™) / of rank 2: 


1 ; 1 
TEM) J = |— 6,2 Fy F™ <= FyF", (10) 
: 167) ° 4a 
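with

\[ T^{(\mathrm{EM})0}{}_{0} = \frac{1}{8\pi}\left[(\mathbf{E})^2 + (\mathbf{B})^2\right], \tag{11} \]

the electromagnetic energy density discovered by W. Thomson (Kelvin) in 1853;

\[ T^{(\mathrm{EM})\mu}{}_{0} = -\frac{1}{4\pi}\, F_{0l}F^{\mu l} = \frac{1}{4\pi}\,(\mathbf{E} \times \mathbf{B})^{\mu}, \tag{12} \]

$(1/c)$ times the electromagnetic energy flux discovered by Poynting and Heaviside
in 1884;

\[ T^{(\mathrm{EM})0}{}_{\mu} = -\frac{1}{4\pi}\, F_{\mu l}F^{0l} = -\frac{1}{4\pi}\,\epsilon_{\mu\beta\gamma} E_{\beta} B_{\gamma} \quad \left(= -\frac{1}{4\pi}\,(\mathbf{E} \times \mathbf{B})^{\mu}\right), \tag{13} \]

$(-c)$ times the electromagnetic momentum density discovered by J. J. Thomson in
1893;

\[ T^{(\mathrm{EM})\nu}{}_{\mu} = \frac{1}{16\pi}\,\delta^{\nu}{}_{\mu}\, F_{kl}F^{kl} - \frac{1}{4\pi}\, F_{\mu l}F^{\nu l}
 = -\frac{1}{8\pi}\left\{\delta^{\nu}{}_{\mu}\left[(\mathbf{E})^2 + (\mathbf{B})^2\right] - 2\left[(\mathbf{E})^{\mu}(\mathbf{E})^{\nu} + (\mathbf{B})^{\mu}(\mathbf{B})^{\nu}\right]\right\}, \tag{14} \]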


the electromagnetic stress discovered by Maxwell in 1873. Here we use Greek indices 
to run from 1 to 3. 
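
As a numerical cross-check of Eqs. (10)-(14), the following Python sketch builds $F_{kl}$ from sample values of E and B with the layout of Eq. (7a), raises indices with the Minkowski metric of Eq. (6), and evaluates Eq. (10) directly. The function names and the sample field values are illustrative assumptions and are not part of the original exposition; Gaussian units with the $1/(4\pi)$ factors of the text are assumed.

```python
import math

# Minkowski metric of Eq. (6), signature -2; it is diagonal and equal to its inverse.
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]


def field_strength(E, B):
    """Covariant F_{kl} with the component layout of Eq. (7a)."""
    E1, E2, E3 = E
    B1, B2, B3 = B
    return [
        [0.0, E1, E2, E3],
        [-E1, 0.0, -B3, B2],
        [-E2, B3, 0.0, -B1],
        [-E3, -B2, B1, 0.0],
    ]


def raise_both(F):
    """F^{ij} = eta^{ik} eta^{jl} F_{kl}; for a diagonal eta this is a sign per index."""
    return [[ETA[i][i] * ETA[j][j] * F[i][j] for j in range(4)] for i in range(4)]


def stress_energy(E, B):
    """Mixed tensor T^i_j of Eq. (10), stored as T[i][j]."""
    F = field_strength(E, B)
    F_up = raise_both(F)
    invariant = sum(F[k][l] * F_up[k][l] for k in range(4) for l in range(4))
    T = [[0.0] * 4 for _ in range(4)]
    for i in range(4):
        for j in range(4):
            delta = 1.0 if i == j else 0.0
            contraction = sum(F[j][l] * F_up[i][l] for l in range(4))
            T[i][j] = delta * invariant / (16 * math.pi) - contraction / (4 * math.pi)
    return T


def cross(a, b):
    """Ordinary 3-vector cross product."""
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]


E = [0.3, -1.2, 0.7]   # arbitrary sample values (assumption)
B = [1.1, 0.4, -0.5]   # arbitrary sample values (assumption)
T = stress_energy(E, B)

energy_density = (sum(x * x for x in E) + sum(x * x for x in B)) / (8 * math.pi)  # Eq. (11)
poynting_over_c = [x / (4 * math.pi) for x in cross(E, B)]                        # Eq. (12)
trace = sum(T[i][i] for i in range(4))

print("T^0_0 from Eq. (10):", T[0][0], " Eq. (11):", energy_density)
print("T^mu_0 from Eq. (10):", [T[m][0] for m in (1, 2, 3)], " Eq. (12):", poynting_over_c)
print("trace:", trace)
```

For any choice of E and B, the (0,0) component agrees with (11), the mixed components agree with (12) and (13) up to rounding, and the trace vanishes. The vanishing trace is the property that matters for the scalar theories discussed in the next section.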

The importance of constructing the four-dimensional electromagnetic
stress-momentum-energy tensor is that it was the first four-dimensional
stress-momentum-energy tensor ever constructed. For electromagnetic energy to
gravitate, it should enter the right-hand side of the new covariant version of (3). However,
electromagnetic energy is only the (0, 0) component of the four-dimensional
stress-momentum-energy tensor; the other components should also enter the right-hand side to
make it covariant.

Directions and new ingredients. During 1905-1910, directions and new ingredients
were formed for a new theory of gravity: finite propagation velocity,
all energy gravitating, EEP, the spacetime formulation of special relativity, the indefinite
metric, and the four-dimensional covariant electromagnetic stress-momentum-energy
(stress-energy) tensor. Two crucial steps are (i) the generalization of the principle
of relativity to include situations with gravity, i.e. the EEP; and (ii) the spacetime
formulation of the (special) relativity theory using the Minkowski metric and its
generalization to the general concept of an indefinite spacetime metric. EEP means that
local physics is special relativistic. Then one can ask what gravity is. It must be
how the various local physics are connected. We have special relativity from locality
to locality, and gravity describes how they are connected. (A natural mathematical
description is a four-dimensional base manifold with special relativity as the fibre
attached to each (world) point in the base manifold, with gravity as the connection
bundle or the metric which induces the connection bundle.) Although this logic
seems compelling, the full metric as the dynamical gravitational entity was not used
until 1913. A test of EEP was derived by Einstein: the gravitational redshift. It
has been an important test of relativistic gravity, and its accuracy is constantly
being improved.


3. The Period of Various Trial Theories: 1911-1914 


Basic formulas of (pseudo-)Riemannian geometry. Here we summarize some basic
formulas used in developing a new theory of gravity, in order to fix the conventions
and notation. First, a (pseudo-)Riemannian manifold is endowed with a
metric $g_{ij}$. The metric $g_{ij}$ is related to the line element $ds$ by

\[ ds^2 = g_{ij}\, dx^i dx^j. \tag{15} \]

If the metric $g_{ij}$ is positive definite, the geometry is Riemannian. If the metric $g_{ij}$ is
indefinite, the geometry is pseudo-Riemannian. $g^{ij}$ is the matrix inverse of $g_{ij}$, and
the two are used to raise and lower indices. For our case,
the geometry is pseudo-Riemannian. We use the MTW^{48} conventions with signature
-2; this is also the convention used in Ref. 49. Latin indices run from 0 to 3; Greek
indices run from 1 to 3. The Christoffel connection $\Gamma^{i}{}_{jk}$ of the metric is given by

\[ \Gamma^{i}{}_{jk} = \tfrac{1}{2}\, g^{il}\left(g_{lj,k} + g_{lk,j} - g_{jk,l}\right). \tag{16} \]


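With the Christoffel connection, one can define the covariant derivative. The Riemann
curvature tensor $R^{i}{}_{jkl}$, the Ricci curvature tensor $R_{jl}$, the scalar curvature $R$
and the Einstein tensor $G_{jl}$ are defined as:

\[ R^{i}{}_{jkl} = \Gamma^{i}{}_{jl,k} - \Gamma^{i}{}_{jk,l} + \Gamma^{i}{}_{mk}\Gamma^{m}{}_{jl} - \Gamma^{i}{}_{ml}\Gamma^{m}{}_{jk}; \quad R_{jl} = R^{i}{}_{jil}; \]

\[ R = g^{jl} R_{jl}; \quad G_{jl} = R_{jl} - \tfrac{1}{2}\, g_{jl} R. \tag{17} \]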
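
To make Eq. (16) concrete, the following Python sketch evaluates the Christoffel connection by central finite differences and compares it with the textbook values for the unit 2-sphere. The choice of the 2-sphere (a Riemannian rather than pseudo-Riemannian example), the helper names, the step size and the restriction to diagonal metrics in the inverse are all illustrative assumptions; Eq. (16) itself is used unchanged.

```python
import math


def metric_sphere(x):
    """Unit 2-sphere metric g_ij in coordinates x = (theta, phi)."""
    theta = x[0]
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]


def inverse_diagonal(g):
    """Inverse of a diagonal metric (assumption: off-diagonal entries vanish)."""
    n = len(g)
    return [[(1.0 / g[i][i] if i == j else 0.0) for j in range(n)] for i in range(n)]


def metric_derivative(metric, x, k, h=1e-5):
    """Central-difference estimate of g_{ij,k} at the point x."""
    x_plus, x_minus = list(x), list(x)
    x_plus[k] += h
    x_minus[k] -= h
    g_plus, g_minus = metric(x_plus), metric(x_minus)
    n = len(x)
    return [[(g_plus[i][j] - g_minus[i][j]) / (2 * h) for j in range(n)] for i in range(n)]


def christoffel(metric, x):
    """Gamma^i_{jk} = (1/2) g^{il} (g_{lj,k} + g_{lk,j} - g_{jk,l}), Eq. (16)."""
    n = len(x)
    g_inv = inverse_diagonal(metric(x))
    dg = [metric_derivative(metric, x, k) for k in range(n)]  # dg[k][i][j] = g_{ij,k}
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                gamma[i][j][k] = 0.5 * sum(
                    g_inv[i][l] * (dg[k][l][j] + dg[j][l][k] - dg[l][j][k])
                    for l in range(n)
                )
    return gamma


theta0 = 0.9  # sample polar angle (assumption)
gamma = christoffel(metric_sphere, [theta0, 0.3])

print("Gamma^theta_{phi phi}:", gamma[0][1][1], " analytic:", -math.sin(theta0) * math.cos(theta0))
print("Gamma^phi_{theta phi}:", gamma[1][0][1], " analytic:", math.cos(theta0) / math.sin(theta0))
```

The same `christoffel` routine applies to a diagonal spacetime metric of signature -2 once a four-coordinate metric function is supplied; only `inverse_diagonal` would need replacing for a non-diagonal metric.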


Gravitational deflection of light and EEP. Extending his work on gravitational
redshift, Einstein^{50} derived the deflection of light in a gravitational field using EEP in 1911.
He argued that since light is a form of energy, light must gravitate and the velocity
of light must depend on the gravitational potential. He found that light grazing
the limb of the Sun would be gravitationally deflected by 0.83 arcsec. This is
very close to the value of 0.84 arcsec derived by Soldner^{51} in 1801, assuming that light
is corpuscular in the Newtonian theory of gravitation. This prediction was half the value
given by GR. Before 1919, there were four expeditions intended to measure the gravitational
deflection of starlight (in 1912, 1914, 1916 and 1918); because of bad weather or war,
the first three expeditions failed to obtain any results, and the results of the 1918 expedition
were never published.^{52} In 1919, the observation of the gravitational deflection of light
passing near the Sun during a solar eclipse^{53} confirmed the relativistic deflection of
light and made GR famous and popular.
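
The sizes of the two predictions can be checked with the standard small-angle formulas, $2GM/(c^2 b)$ for the EEP-only (or Newtonian corpuscular) calculation and $4GM/(c^2 b)$ for GR, where $b$ is the impact parameter. These closed-form expressions are not written out in the text, and the solar constants below are present-day values assumed for illustration, so this is a sketch of the arithmetic rather than a reconstruction of the historical computations.

```python
import math

GM_SUN = 1.32712440018e20   # m^3 s^-2, heliocentric gravitational constant (assumed modern value)
R_SUN = 6.957e8             # m, nominal solar radius (assumed modern value)
C = 299_792_458.0           # m/s, speed of light

RAD_TO_ARCSEC = 180.0 / math.pi * 3600.0


def deflection_arcsec(factor, gm=GM_SUN, b=R_SUN, c=C):
    """Small-angle deflection factor * GM / (c^2 b), converted to arcseconds."""
    return factor * gm / (c * c * b) * RAD_TO_ARCSEC


eep_only = deflection_arcsec(2.0)   # EEP-only or Newtonian corpuscular value
full_gr = deflection_arcsec(4.0)    # general relativistic value

print(f"EEP-only deflection at the solar limb: {eep_only:.3f} arcsec")
print(f"GR deflection at the solar limb:       {full_gr:.3f} arcsec")
```

With these inputs the EEP-only value comes out near 0.87 arcsec and the GR value near 1.75 arcsec. The small difference from the 0.83 and 0.84 arcsec quoted above presumably reflects the constants and rounding used at the time.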

Stress-energy tensor. In 1911, Laue^{54,55} extended Minkowski's method of constructing
the electromagnetic stress-energy tensor to stressed bodies, dust and relativistic
fluids.

Gravity theories with 'variable velocity of light' and scalar theories of gravity.
Accepting that the velocity of light depends on the gravitational potential, Abraham^{56}
postulated that its negative gradient indicates the direction of the gravitational force
and worked out a theory of gravity. Einstein^{57} worked out a somewhat different
theory. These gravity theories with 'variable velocity of light' led to the
conformally flat scalar theories proposed by Nordström.^{58-60}

The equation corresponding to Eq. (3) in Newtonian theory for electromagnetism is

\[ \frac{1}{c^2}\frac{\partial^2 A_i}{\partial t^2} - \nabla^2 A_i = \eta^{jk} A_{i,jk} = 4\pi J_i, \tag{18} \]

with gauge condition

\[ A^{i}{}_{,i} = 0. \tag{19} \]

Here $A_i$ is the electromagnetic 4-potential, guaranteed locally by (8b), such that
$F_{ij} = A_{j,i} - A_{i,j}$. To incorporate the finite propagation speed with light velocity
into the gravitational field equation, one could just replace (3) with

\[ \frac{1}{c^2}\frac{\partial^2 \Phi^*}{\partial t^2} - \nabla^2 \Phi^* = \eta^{ij}\, \Phi^*{}_{,ij} = -4\pi G_N\, \rho^*(\mathbf{x}, t), \tag{20} \]

where $\Phi^*(\mathbf{x}, t)$ is a new gravitational field entity. $\Phi^*$ could be a scalar field, a vector
field, a tensor field or some combination of them. If $\Phi^*$ is a scalar field, $\rho^*$ must be
a scalar; in the weak-field and slow-motion limit, one must be able to approximate
$\Phi^*$ and $\rho^*$ by $\Phi$ and $\rho$. Let us illustrate with Einstein's theory with 'variable velocity
of light'.

In the original formulation of Einstein,^{57} the equation of motion for particles
was derived from the variational principle

\[ \delta \int ds = 0, \tag{21} \]

where

\[ ds^2 = c^2 dt^2 - dx^2 - dy^2 - dz^2 \tag{22} \]

and where $c$ is a scalar function which Einstein regarded as the velocity of light
in the metric (22). Einstein postulated that $c$ depends on the scalar field $\varphi$ in the
following way:

\[ c^2 = c_0^2 - 2\varphi \tag{23} \]

and that $\varphi$ is generated by $\rho^*$ through the wave equation

\[ \frac{1}{c^2}\frac{\partial^2 \varphi}{\partial t^2} - \nabla^2 \varphi(\mathbf{x}, t) = 4\pi G_N\, \rho^*(\mathbf{x}, t). \tag{24} \]

By choosing suitable units, we can set $c_0 = 1$; and by postulating that Einstein's
$ds^2$ is the "physical metric", we can bring the theory into the form:

\[ ds^2 = (c_0^2 - 2\varphi)\, dt^2 - dx^2 - dy^2 - dz^2, \tag{25a} \]

\[ \eta^{ij} \varphi_{,ij} = 4\pi G_N\, \rho^*. \tag{25b} \]

Note that in (25b), as well as in (20), we have the a priori (nondynamical) geometric
element $\eta^{ij}$ to make the equation fully coordinate covariant. More precisely,
Eq. (25a) also contains a priori geometric elements: a flat-space metric and a
time direction. This makes the theory a stratified theory with conformally flat
space slices. For more detailed discussions, see Refs. 61 and 62.

The physical metric can always be transformed locally into the Lorentz form

\[ ds^2 = c_0^2\, dt^2 - dx^2 - dy^2 - dz^2, \tag{26} \]

where $dt$ is the proper time interval and $dl = (dx^2 + dy^2 + dz^2)^{1/2}$ the proper-length
element. Since light trajectories all lie on null cones of this metric, the velocity of
light as measured using the physical metric is always $c_0$, as it must be for any
theory that satisfies the EEP.

This theory did not agree with the Mercury perihelion advance observation.
However, it led to the conformally flat theories of Nordström.^{58-60}

The field equations of Nordström's second theory^{59,60} can be written as

\[ C_{ijkl} = 0, \tag{27} \]

\[ R = 24\pi \left(\frac{G_N}{c^4}\right) T, \tag{28} \]

where $C_{ijkl}$ is the Weyl conformal tensor and $R$ is the curvature scalar, both
constructed from the metric $g_{ij}$. $T$ is the trace contraction of the stress-energy tensor. The
field equations (27) and (28) are geometric and make no reference to any gravitational
fields except the physical metric $g_{ij}$. However, they guarantee the existence
of a flat spacetime metric $\eta_{ij}$ (prior geometry in the language of Ref. 48) and a
scalar field $\varphi$ related to $g_{ij}$ by

\[ g_{ij} = \varphi^2\, \eta_{ij}, \tag{29} \]

and they allow $\varphi$ to be calculated from the variational principle


5| a _ (5) -o?] d*x = 0, (30) 


where $g = \det(g_{ij})$ and $L_I$ is the interaction Lagrangian density of matter with
gravity (see Ref. 61 for more details). Expressed in terms of $\varphi$, the field equation (28)
becomes

\[ \eta^{ij} \varphi_{,ij} = -\left(\frac{4\pi G_N}{c^4}\right) T \varphi^3 \tag{31a} \]

or

\[ \varphi\, \eta^{ij} \varphi_{,ij} = -\left(\frac{4\pi G_N}{c^4}\right) T_{\mathrm{flat}}. \tag{31b} \]


Equation (31) is Nordström's original field equation,^{59,60} while Eq. (28) is the
Einstein-Fokker version.^{63} This second Nordström theory did not agree with the
Mercury perihelion advance observation either. From (11) and (14), we note that the trace
contraction of the electromagnetic stress-energy tensor vanishes. Therefore, in this theory
electromagnetic energy does not contribute to the generation of the gravitational field.
Nor does the gravitational field deflect light. Somehow nature did not choose
this way. Nature chose to make the whole metric dynamic.

Tensor theory of gravity. In 1913, Einstein and Grossmann turned to a tensor
theory of gravity making full use of the metric. They tried to incorporate all the
ingredients discussed in the last section into their "Entwurf (outline)" theory^{64} and
proposed the following equation, using the metric $g_{ij}$ as the dynamical entity for the
gravitational field:

\[ \text{Part of Ricci tensor } R_{ij} \propto T_{ij}. \tag{32} \]

Since the left-hand side did not contain all the terms of the Ricci tensor, it is not
covariant. In 1913, Besso and Einstein^{65} worked out a Mercury perihelion advance
formula in the "Einstein-Grossmann Entwurf" theory,^{64} but the calculation
contained an error and the result did not agree with the Mercury perihelion advance
observation. Nevertheless, the "Entwurf" theory is an important landmark in the
genesis of GR.

Einstein became versed in differential geometry and tensor analysis in 1914. The
abstract of Einstein's October 1914 paper "The formal foundation of the general
theory of relativity"^{66} shows the situation: "In recent years I have
worked, in part with my friend Grossmann, on a generalization of the theory of
relativity. During these investigations, a kaleidoscopic mixture of postulates from
physics and mathematics has been introduced and used as heuristical tools; as
a consequence it is not easy to see through and characterize the theory from a
formal point of view, that is, only based upon these papers. The primary objective
of the present paper is to close this gap. In particular, it has been possible to
obtain the equations of the gravitational field in a purely covariance-theoretical
manner (Section D). I also tried to give simple derivations of the basic laws of
absolute differential calculus – in part, they are probably new ones (Section B) –
in order to allow the reader to get a complete grasp of the theory without having
to read other, purely mathematical tracts. As an illustration of the mathematical
methods, I derived the Eulerian equations of hydrodynamics and the field equations
of the electrodynamics of moving bodies (Section C). Section E shows that Newton's
theory of gravitation follows from the general theory as an approximation. The
most elementary features of the present theory are also derived inasfar as they are
characteristic of a Newtonian (static) gravitational field (curvature of light rays,
shift of spectral lines)".

In the 1911-1914 period, various trial theories, based largely on the ingredients 
and directions set in the previous period 1905-1910, emerged and led step by step 
towards the synthesis of GR. 


4. The Synthesis and Consolidation: 1915-1916 


Einstein's big step. Continuing along the direction set in the "Entwurf" theory,
Einstein^{67,68} reached the following equation for GR in 1915:

\[ R_{ij} \propto T_{ij}. \tag{33} \]

Subsequently, Einstein^{69} corrected an error made in his collaboration with Besso
of 1913^{65} and obtained a value of the Mercury perihelion advance from his new
equation (33) in agreement with the observation. Apparently, this correct calculation
played a significant role in the final genesis of GR. The divergence of $T_{ij}$ vanishes.
However, the divergence of $R_{ij}$ does not vanish unless $T$ vanishes or is constant.
Since the trace $T^{(\mathrm{EM})}$ of the electromagnetic stress-energy tensor does vanish, Einstein
argued that:^{68}

“One now has to remember that by our knowledge “matter” is not to be perceived
as something primitively given or physically plain. There even are those,
and not just a few, who hope to reduce matter to purely electrodynamic processes,
which of course would have to be done in a theory more completed than Maxwell’s
electrodynamics. Now let us just assume that in such completed electrodynamics
scalar of the energy tensor also would vanish! Would the result, shown above, prove
that matter cannot be constructed in this theory? I think I can answer this question
in the negative, because it might very well be that in “matter”, to which the
previous expression relates, gravitational fields do form an important constituent.
In that case, $\sum T^{\mu}{}_{\mu}$ can appear positive for the entire structure while in reality only
$\sum (T^{\mu}{}_{\mu} + t^{\mu}{}_{\mu})$ is positive and $\sum T^{\mu}{}_{\mu}$ vanishes everywhere. In the following we assume
the conditions $\sum T^{\mu}{}_{\mu} = 0$ really to be generally true”.

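Hilbert variational principle. Shortly after Einstein obtained (33), Hilbert^{70}
proposed the variational principle for the gravitational field:

\[ \delta \int \left[L^{(\mathrm{EM})} + L^{(\mathrm{GRAV})}\right] d^4x = 0, \tag{34} \]

with the Lagrangian densities $L^{(\mathrm{EM})}$ and $L^{(\mathrm{GRAV})}$ given by

\[ L^{(\mathrm{EM})} \propto F_{kl}F^{kl}\,(-g)^{1/2}, \tag{35a} \]

\[ L^{(\mathrm{GRAV})} \propto R\,(-g)^{1/2}. \tag{35b} \]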

The variation of the integral of (35a), plus the Lagrangian density term for the
interaction of the electromagnetic 4-current with the electromagnetic 4-potential, gives the Maxwell
equations. The variation of the integral of (35a) with respect to the metric gives
the electromagnetic stress-energy tensor. The variation of (35b) together with (35a)
with respect to the metric would give:

\[ R_{ij} - \tfrac{1}{2}\, g_{ij} R \propto T^{(\mathrm{EM})}_{ij}. \tag{36a} \]

With a more general Lagrangian $L_I$, such as Mie's Lagrangian, instead of $L^{(\mathrm{EM})}$, the
variation of its integral and Hilbert's gravitational integral with respect to the
metric would give

\[ R_{ij} - \tfrac{1}{2}\, g_{ij} R \propto T_{ij}. \tag{36b} \]

Here $T_{ij}$ is given using Hilbert's variation-with-respect-to-the-metric definition.
Einstein equation. After Hilbert's work,^{70} Einstein^{71} soon corrected his field
equation (33) on 25 November 1915 to

\[ R_{ij} \propto T_{ij} - \tfrac{1}{2}\, g_{ij} T. \tag{37} \]

Equation (36a) or Equation (36b), with the source written as $T_{ij}$, i.e.

\[ R_{ij} - \tfrac{1}{2}\, g_{ij} R \propto T_{ij}, \tag{38} \]

and Eq. (37) are equivalent to each other: by taking a trace contraction of either
equation, one has

\[ R \propto -T, \tag{39} \]

and the equivalence becomes clear. Since the variational principle became common, the
Einstein equation is nowadays normally written in the form (38). With the proportionality
constant inserted, the Einstein equation is

\[ G_{ij} = R_{ij} - \tfrac{1}{2}\, g_{ij} R = 8\pi \left(\frac{G_N}{c^4}\right) T_{ij}. \tag{40} \]
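
The trace argument can be verified numerically. The Python sketch below takes a random symmetric $T_{ij}$, forms the trace-reversed combination $R_{ij} = \kappa\,(T_{ij} - \tfrac{1}{2} g_{ij} T)$, and checks that $R_{ij} - \tfrac{1}{2} g_{ij} R$ then equals $\kappa\, T_{ij}$ and that $R = -\kappa T$. Evaluating at a single point with $g_{ij} = \eta_{ij}$, setting $\kappa = 1$, and the helper names are illustrative assumptions, not part of the original exposition.

```python
import random

# Metric at the chosen point: Minkowski, signature -2 (diagonal and equal to its inverse).
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
KAPPA = 1.0  # proportionality constant, set to 1 for illustration


def trace(S):
    """g^{ij} S_{ij} evaluated with g = eta."""
    return sum(ETA[i][i] * S[i][i] for i in range(4))


def trace_reverse(S):
    """S_{ij} - (1/2) g_{ij} (g^{kl} S_{kl}) with g = eta."""
    t = trace(S)
    return [[S[i][j] - 0.5 * ETA[i][j] * t for j in range(4)] for i in range(4)]


random.seed(0)
T = [[0.0] * 4 for _ in range(4)]
for i in range(4):
    for j in range(i, 4):
        T[i][j] = T[j][i] = random.uniform(-1.0, 1.0)  # random symmetric source

ricci = [[KAPPA * x for x in row] for row in trace_reverse(T)]  # R_ij = kappa (T_ij - g_ij T / 2)
einstein = trace_reverse(ricci)                                 # R_ij - g_ij R / 2

max_residual = max(abs(einstein[i][j] - KAPPA * T[i][j]) for i in range(4) for j in range(4))
print("max |G_ij - kappa T_ij| =", max_residual)
print("R + kappa T =", trace(ricci) + KAPPA * trace(T))
```

Applying the trace reversal twice returns the original tensor in four dimensions, which is why the two ways of writing the field equation carry the same information.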
In 1916, Einstein^{72} wrote a foundational paper on GR. In the same year, Einstein
performed a linear approximation in the weak field and obtained the quadrupole
radiation formula^{73}; major errors were corrected in his 1918 paper^{74} while a factor
of 2 was corrected by Eddington.^{75} (For later controversial issues on gravitational
waves and the quadrupole formula, see e.g. Refs. 76 and 77.) In his first paper on the
linear approximation and gravitational waves,^{73} Einstein thought that quantum effects
must modify GR, although he switched to a different point of view when working on
the unification of electromagnetism and gravitation in the 1930s. The merging of
GR and quantum theory is an important issue. For a brief history of ideas and
prospects, see e.g. Ref. 78.

In 1916, Schwarzschild discovered an exact spherical solution (the Schwarzschild
solution) of the Einstein equation.^{79,80}

In 1917, Einstein^{81} postulated the cosmological principle, applied GR to cosmology
and proposed the cosmological constant; de Sitter^{82-85} followed in the same
year with an inflationary solution to cosmology. Ever since its genesis, GR has
gone hand in hand with the development of cosmology. For a brief history of this
connection and mutual development, see e.g. Ref. 86; for recent reviews on various
topics in cosmology, see e.g. Refs. 87-92.

Table 3. Historical steps in the genesis of GR since the genesis of special relativity.

Year       Reference                    Historical step
1905-1906  Poincaré^{27,41}             Attempt to formulate an action-at-a-distance theory of gravity with finite propagation velocity compatible with the principle of relativity
1907       Planck^{42}                  All energy must gravitate
1907-1908  Einstein^{43}                Generalized principle of relativity (EEP) and the prediction of gravitational redshift
1907-1908  Minkowski^{32,44}            Covariant spacetime formulation of electromagnetism and the derivation of the four-dimensional electromagnetic stress-energy tensor
1909-1910  Bateman^{46}                 Introduction of the indefinite spacetime metric
1911       Einstein^{50}                Use of EEP to derive the deflection of light in a gravitational field
1911       Laue^{54,55}                 Stress-energy tensor of matter
1911-1912  Abraham,^{56} Einstein^{57}  Theories with 'variable velocity of light'
1912       Nordström^{58}               Nordström's first theory
1913       Nordström^{59,60}            Nordström's second theory
1913       Einstein and Grossmann^{64}  "Entwurf (Outline)" theory
1914       Einstein and Fokker^{63}     Covariant formulation of Nordström's second theory
1914       Einstein^{66}                Einstein versed in covariant formulation
1915       Einstein^{67-69}             Source-restricted Einstein equation
1915       Hilbert^{70}                 Hilbert variational principle
1915       Einstein^{71}                Einstein equation
1916       Einstein^{72}                Einstein's foundational paper on general relativity
1916       Einstein^{73}                Approximate solution and gravitational waves
1916       Schwarzschild^{79}           Exact spherical solution of the Einstein equation
1917       Einstein^{81}                Cosmological principle, cosmology and cosmological constant
1917       de Sitter^{82-85}            de Sitter inflationary solution (cosmology)
1918       Einstein^{74,75}             Quadrupole radiation formula
1918       Lense-Thirring^{93}          Lense-Thirring gravitomagnetic effect

In 1918, Lense and Thirring^{93} discovered the frame-dragging effect in GR.

After one hundred years of development, GR has become indispensable in precision
measurement, astrophysics, cosmology and theoretical physics. The first direct
detections of gravitational waves**® in the centennial of the genesis of GR truly 
celebrate this occasion. 

The route to GR was indeed guided by covariance. However, once GR is reached
and covariance is fully understood, the principle of covariance can accommodate
various things (scalars, vectors, a priori objects, etc.) and various theories of gravity.
It is probably a minimax principle that worked in nature: when an entity is needed,
it should saturate its maximal capacity.

In Table 3, we list historical steps in the genesis of GR discussed in the last 
section and this section. 




5. Epilogue 


The study of history gives inspiration, and the study of the genesis of GR clearly does so.
It is fortunate that most of the records are intact and the step-by-step developments
are transparent. We hope that this short exposition conveys the flavor of, and some
insights into, the development. The genesis of GR was a community effort with Einstein
clearly dominating the scene. Knowledge accrues gradually most of the time.
Cognition sometimes comes in somewhat bigger steps. The cognition that energy
must gravitate and that the EEP must be valid are such examples. Initial cognition
needs consolidation and development. EEP indicates that local physics must be
special relativistic. Minkowski's spacetime formulation indicates that local physics
must have an indefinite metric. Taking the metric as the basic entity for gravitation took
a few years of studying gravitational redshift, gravitational light deflection,
theories of gravity with "variable velocity of light" and further scalar theories of gravity.
Eventually, the metric as a full dynamic entity for gravitation emerged in 1913. It
took a couple of years to master this approach, leading to the genesis of GR.
During the different phases of the genesis, the first general relativistic effect to be
discovered, the Mercury perihelion advance anomaly, played a key role.


Acknowledgments 


I would like to thank the Science and Technology Commission of Shanghai Municipality
(STCSM-14140502500) and the Ministry of Science and Technology of China
(MOST-2013YQ150829, MOST-2016YFF0101900) for supporting this work in part.


References 


1. I. Newton, Philosophiae Naturalis Principia Mathematica (Streater, London, 1687). 

2. E. Whittaker, A History of the Theories of Aether and Electricity II. The Modern 
Theories (Philosophical Library, 1954; American Institute of Physics, 1987). 

3. J. Kepler, Astronomia Nova de Motibus Stellae Martis (Prague, 1609); Harmonice 
Mundi (Linz, 1619). 

4. C. Wilson, The great inequality of Jupiter and Saturn from Kepler to Laplace, Ph.D.
thesis, St. John’s College, Annapolis, Maryland, January 10, 1984, www.docin.com/p-1627154943.html.

5. P. Moore, The Story of Astronomy, 5th rev. ed. (Grosset & Dunlap Publishing, New
York, 1977).

6. U. J. J. Le Verrier, Théorie du mouvement de Mercure, Ann. Observ. imp. Paris
(Mém.) 5 (1859) 1.

7. U. J. J. Le Verrier, Nouvelles recherches sur les mouvements des planètes, C. R. Acad.
Sci. Paris 29 (1849); the English translation of the quote in the text is from Ref. 8.

8. N. T. Roseveare, Mercury’s Perihelion from Le Verrier to Einstein (Clarendon Press, 
Oxford, 1982). 

9. S. Newcomb, Discussion and results of observations on transits of Mercury from 1677 
to 1881, Astr. Pap. Am. Ephem. Naut. Alm. 1 (1882) 367. 

10. I. I. Shapiro, Solar system tests of GR: Recent results and present plans, in General
Relativity and Gravitation: Proc. 12th Int. Conf. General Relativity and Gravitation,
University of Colorado at Boulder, July 2-8, 1989, eds. N. Ashby, D. F. Bartlett and
W. Wyss (Cambridge, Cambridge University Press, 1990), pp. 313-330.

11. W.-T. Ni, Solar-system tests of the relativistic gravity, in One Hundred Years of
General Relativity: From Genesis and Foundations to Gravitational Waves, Cosmology
and Quantum Gravity, Chap. 8, ed. W.-T. Ni (World Scientific, Singapore, 2016);
Int. J. Mod. Phys. D 25 (2016) 1630003.

12. A. A. Michelson and E. W. Morley, On the relative motion of the Earth and the
luminiferous ether, Am. J. Sci. 34 (1887) 333.

13. M. Nagel, S. R. Parker, E. V. Kovalchuk, P. L. Stanwix, J. G. Hartnett, E. N. Ivanov,
A. Peters and M. E. Tobar, Direct terrestrial test of Lorentz symmetry in electrodynamics
to $10^{-18}$, Nat. Commun. 6 (2015) 8174.

14. R. V. Eötvös, Math. Naturwiss. Ber. Ungarn 8 (1889) 65.

15. G. Galilei, Discorsi e Dimostrazioni Matematiche Intorno a due Nuove Scienze
(Elzevir, Leiden, 1638).

16. T. A. Wagner, S. Schlamminger, J. H. Gundlach and E. G. Adelberger, Torsion-balance
tests of the weak equivalence principle, Class. Quantum Grav. 29 (2012) 184002.

17. P. Touboul, M. Rodrigues, G. Métris and B. Tatry, MICROSCOPE, testing the equivalence
principle in space, C. R. Acad. Sci. Ser. IV 2(9) (2001) 1271.

18. MICROSCOPE Collaboration, CNES ONERA cooperation first ultra-precise measurements
from Microscope (September 27, 2016), https://presse.cnes.fr/en/cnes-onera-cooperation-first-ultra-precise-measurements-microscope.

19. W.-T. Ni, Equivalence principles, spacetime structure and the cosmic connection, in
One Hundred Years of General Relativity: From Genesis and Foundations to Gravitational
Waves, Cosmology and Quantum Gravity, Chap. 5, ed. W.-T. Ni (World
Scientific, Singapore, 2016); Int. J. Mod. Phys. D 25 (2016) 1630002.

20. H. A. Lorentz, Kon. Neder. Akad. Wet. Amsterdam. Versl. Gewone Vergad. Wisen
Natuurkd. Afd. 6 (1904) 809.

21. H. Poincaré, Bibl. Congr. Int. Philos. 3 (1900) 457.

22. H. Poincaré, La Science et l’Hypothèse (Flammarion, Paris, 1902) (in German);
Science and Method (Dover Publications Inc., London, 1952).

23. H. Poincaré, L’état et l’avenir de la physique mathématique, Bull. Sci. Math.
28 (1904) 302; the English translation in the text is from Ref. 26.

24. H. Poincaré, Electricité et Optique: La Lumière et les Théories Electrodynamiques
(Gauthier-Villars, Paris, 1901).

25. V. Messager and C. Letellier, A genesis of special relativity, in One Hundred Years of
General Relativity: From Genesis and Foundations to Gravitational Waves, Cosmology
and Quantum Gravity, Chap. 1, ed. W.-T. Ni (World Scientific, Singapore, 2016); Int.
J. Mod. Phys. D 25 (2015) 1530024.

26. C. Marchal, Sciences 97 (1997) 2.

27. H. Poincaré, Sur la dynamique de l’électron, C. R. Acad. Sci. 140 (1905) 1504.

28. A. Einstein, Zur Elektrodynamik bewegter Körper [On the Electrodynamics of Moving
Bodies], Ann. Phys. 17 (1905) 891.

29. A. Einstein, Ist die Trägheit eines Körpers von seinem Energieinhalt abhängig? Ann.
Phys. 18 (1905) 639.

30. A. Einstein, Das Prinzip von der Erhaltung der Schwerpunktsbewegung und die
Trägheit der Energie, Ann. Phys. 20 (1906) 627.

31. M. Planck, Verh. Dtsch. Phys. Ges. 8 (1906) 136.

32. H. Minkowski, Die Grundgleichungen für die elektromagnetischen Vorgänge
in bewegten Körpern, Königliche Gesellschaft der Wissenschaften zu Göttingen.
Mathematisch-Physikalische Klasse. Nachrichten, pp. 53-111 (1908); this paper was
read before the Academy on 21 December 1907; (English translation) The fundamental
equations for electromagnetic processes in moving bodies, translated from German
by Meghnad Saha and Wikisource, available at: http://en.wikisource.org/wiki/Translation:The_Fundamental_Equations_for_Electro.

33. B. Riemann, Über die Hypothesen welche der Geometrie zu Grunde liegen (On the
hypotheses which underlie geometry), lecture at Göttingen (1854).

34. E. B. Christoffel, J. Math. 70 (1869) 241.

35. F. Klein, Programm zum Eintritt in die Philosophische Fakultät der Universität zu
Erlangen, Erlangen, A. Deichert (1872); reprinted in 1893 in Math. Ann. 63, and in
Klein’s Ges. Math. Abhandl. I, 460.

36. C. Niven, Trans. R. Soc. Edinburgh 27 (1874) 473.

37. W. Thomson, Philos. Trans. 146 (1856) 481.

38. W. J. M. Rankine, Philos. Trans. 146 (1856) 261.

39. J. W. Gibbs, Vector Analysis (New Haven, 1881-1884), p. 57.

40. G. Ricci and T. Levi-Civita, Méthodes de calcul différentiel absolu et leurs applications,
Math. Ann. 54 (1900) 125.

41. H. Poincaré, Sur la dynamique de l’électron, Rend. Circ. Mat. Palermo 21 (1906) 129.

42. M. Planck, Zur Dynamik bewegter Systeme, Sitz. K.-Preuss. Akad. Wiss. (1907) 542
(especially p. 544) (in German); M. Planck, On the dynamics of moving systems (1907),
translated from German by Wikisource, available at:
https://en.wikisource.org/wiki/Translation:On_the_Dynamics_of_Moving_Systems.

43. A. Einstein, Über das Relativitätsprinzip und die aus demselben gezogenen Folgerungen,
Jahrb. Radioakt. Elektron. 4 (1907) 411 (in German); corrections by Einstein in
Jahrb. Radioakt. Elektron. 5 (1908) 98; English translation by H. M. Schwartz in
Am. J. Phys. 45 (1977) 811.

44. H. Minkowski, Raum und Zeit, Phys. Z. 10 (1909) 104 (in German); H. A. Lorentz,
A. Einstein, H. Minkowski and H. Weyl, The Principle of Relativity, translated by
W. Perrett and G. B. Jeffery (Dover, New York, 1952).

45. R. Hargreaves, Integral forms and their connection with physical equations, Camb.
Philos. Trans. 21 (1908) 107.

46. H. Bateman, The transformation of the electrodynamical equations, Proc. Camb.
Math. Soc., Ser. 28 (1910) 223.

47. F. Kottler, Über die Raumzeitlinien der Minkowski’schen Welt, Sitz. Akad. Wiss.
Wien, Math.-Naturw. Kl. Abt. IIa 121 (1912) 1688.

48. C. W. Misner, K. S. Thorne and J. A. Wheeler, Gravitation (Freeman, San Francisco,
1973).

49. K. Kuroda, W.-T. Ni and W.-P. Pan, Gravitational waves: Classification, methods
of detection, sensitivities, and sources, in One Hundred Years of General Relativity:
From Genesis and Empirical Foundations to Gravitational Waves, Cosmology and
Quantum Gravity, Chap. 10, ed. W.-T. Ni (World Scientific, Singapore, 2016); Int. J.
Mod. Phys. D 24 (2015) 1530031.

50. A. Einstein, Über den Einfluß der Schwerkraft auf die Ausbreitung des Lichtes, Ann.
Phys. 35 (1911) 898; translated as “On the Influence of Gravitation on the Propagation
of Light” in The Collected Papers of Albert Einstein, Vol. 3: The Swiss Years: Writings,
1909-1911 (Princeton University Press, Princeton, New Jersey, 1994), Anna Beck
translator.

51. J. G. von Soldner, On the deviation of a light ray from its motion along a straight
line through the attraction of a celestial body which it passes close by, Astronomisches
Jahrbuch für das Jahr 1804 (C. F. E. Späthen, Berlin, 1801), pp. 161-172.


52. J. Earman and C. Glymour, Relativity and eclipses: The British eclipse expeditions
of 1919 and their predecessors, in Historical Studies in the Physical Sciences, Vol. 11
(1980), pp. 49-88.

53. F. Dyson, A. Eddington and C. Davidson, A determination of the deflection of light
by the Sun’s gravitational field, from observations made at the total eclipse of May
29, 1919, Philos. Trans. R. Soc. 220A (1920) 291.

54. M. von Laue, Zur Dynamik der Relativitätstheorie, Annalen der Physik 35 (1911)
524-542.

55. M. von Laue, Das Relativitätsprinzip (Friedrich Vieweg und Sohn, 1911).

56. M. Abraham, Lincei Atti 20 (1911) 678.

57. A. Einstein, Ann. Phys. 38 (1912) 355, 443.

58. G. Nordström, Relativitätsprinzip und Gravitation, Phys. Zeit. 13 (1912) 1126-1129.

59. G. Nordström, Zur Theorie der Gravitation vom Standpunkt des Relativitätsprinzips,
Ann. Phys. 42 (1913) 533.

60. G. Nordström, Die Fallgesetze und Planetenbewegung in der Relativitätstheorie,
Ann. Phys. 43 (1914) 1101.

61. W.-T. Ni, Theoretical frameworks for testing relativistic gravity, IV: A compendium
of metric theories of gravity and their post-Newtonian limits, Astrophys. J. 176 (1972)
769.

62. G. J. Whitrow and G. E. Morduch, Relativistic theories of gravitation, in Vistas in
Astronomy, ed. A. Beer, Vol. 6 (Pergamon Press, Oxford, 1965), pp. 1-67.

63. A. Einstein and A. D. Fokker, Ann. Phys. 44 (1914) 321.

64. A. Einstein and M. Grossmann, Entwurf einer verallgemeinerten Relativitätstheorie
und einer Theorie der Gravitation [Outline of a Generalized Theory of Relativity and
of a Theory of Gravitation], Z. Math. Phys. 62 (1913) 225.

65. A. Einstein and M. Besso, Manuscript on the motion of the perihelion of Mercury
(1913) 360 (in German), in The Collected Papers of Albert Einstein, Vol. 4: The Swiss
Years: Writings, 1912-1914 Albert Einstein, eds. M. J. Klein, A. J. Kox, J. Renn and
R. Schulmann (Princeton University Press, 1995), see also the editorial note on p. 344;
available at: http://press.princeton.edu/einstein/digital/.

66. A. Einstein, Die formale Grundlage der allgemeinen Relativitätstheorie, Sitz. ber.
Akad. Wiss. 1914 (1914) 1030 (in German); The Collected Papers of Albert Einstein,
Vol. 6: The Berlin Years: Writings, 1914-1917 Albert Einstein (English translation
supplement), eds. M. J. Klein, A. J. Kox, J. Renn and R. Schulmann (Princeton
University Press, 1995), pp. 30-84; available at: http://press.princeton.edu/einstein/digital/.

67. A. Einstein, Zur allgemeinen Relativitätstheorie, Sitzber. Preuss. Akad. Wiss. 1915
(1915) 778 (in German); in The Collected Papers of Albert Einstein, Vol. 6: The
Berlin Years: Writings, 1914-1917 Albert Einstein (English translation supplement),
eds. M. J. Klein, A. J. Kox, J. Renn and R. Schulmann (Princeton University Press,
1995), pp. 98-107; available at: http://press.princeton.edu/einstein/digital/.

68. A. Einstein, Zur allgemeinen Relativitätstheorie (Nachtrag), Sitz. ber. Preuss. Akad.
Wiss. (1915) 799 (in German); in The Collected Papers of Albert Einstein, Vol. 6: The
Berlin Years: Writings, 1914-1917 Albert Einstein (English translation supplement),
pp. 108-110, eds. M. J. Klein, A. J. Kox, J. Renn and R. Schulmann (Princeton
University Press, 1995); available at: http://press.princeton.edu/einstein/digital/.

69. A. Einstein, Erklärung der Perihelbewegung des Merkur aus der allgemeinen
Relativitätstheorie, Sitz. ber. Preuss. Akad. Wiss. (1915) 831 (in German); English
translation in The Collected Papers of Albert Einstein, Vol. 6: The Berlin Years: Writings,
1914-1917 Albert Einstein (English translation supplement), eds. M. J. Klein,
A. J. Kox, J. Renn and R. Schulmann (Princeton University Press, 1995), pp. 112-116;
available at: http://press.princeton.edu/einstein/digital/.

70. D. Hilbert, Die Grundlagen der Physik, K. Ges. Wiss. Gött. Nachr. Math.-Phys. K.
(1915) 395.

71. A. Einstein, Die Feldgleichungen der Gravitation, Sitz. ber. Preuss. Akad. Wiss.
(1915) 844 (in German); The Collected Papers of Albert Einstein, Vol. 6: The Berlin
Years: Writings, 1914-1917 Albert Einstein (English translation supplement), eds. M.
J. Klein, A. J. Kox, J. Renn and R. Schulmann (Princeton University Press, 1995),
pp. 117-120; available at: http://press.princeton.edu/einstein/digital/.

A. Einstein, Die Grundlage der allgemeinen Relativitatstheorie, Ann. Phys. 49 (1916) 
769. 

A. Einstein, Naherungsweise Integration der Feldgleichungen der Gravitation, Sitz. 
ber Preuss. Akad. Wiss. (1916) 688 (in German) (translated by Alfred Engel) in The 
Collected Papers of Albert Einstein, Vol. 6: The Berlin Years: Writings, 1914-1917 
(English translation supplement, Doc. 32 (Princeton University Press, 1997), pp. 201— 
210, Available at: http: //einsteinpapers.press.princeton.edu/vol6-trans). 

A. Einstein, Uber Gravitationswellen, Sitz. ber. K. Preuss. Akad. Wiss. (1918) (in 
German) 154; (in German) (translated by Alfred Engel) in The Collected Papers of 
Albert Einstein, Vol. 7: The Berlin Years: Writings, 1918-1921 (English translation 
supplement, Doc. 1 On gravitational waves (Princeton University Press), pp. 9-27, 
Available at: http://einsteinpapers.press.princeton.edu/vol7-trans). 

A. S. Eddington, The propagation of gravitational waves, Proc. R. Soc. Lond. A 102 
(1922) 268. 

D. Kennefick, Traveling at the Speed of Thought: Einstein and the Quest for Gravita- 
tional Waves (Princeton University Press, Princeton, 2007). 

C.-M. Chen, J. M. Nester and W.-T. Ni, A brief history of gravitational wave research, 
Chin. J. Phys. (2016), Available at: http://dx.doi.org/10.1016/j.cjph.2016.10.014. 

S. Carlip, D.-W. Chiou, W.-T. Ni and R. Woodard, Quantum gravity: A brief his- 
tory of ideas and some prospects, in One Hundred Years of General Relativity: From 
Genesis and Foundations to Gravitational Waves, Cosmology and Quantum Gravity, 
ed. W.-T. Ni (World Scientific, Singapore, 2016); Int. J. Mod. Phys. D 25 (2015) 
1530028. 

K. Schwarzschild, “Uber das Gravitationsfeld eines Massenpunktes nach der Ein- 
steinschen Theorie”, Sitzungsberichte der Kéniglich Preussischen Akademie der Wis- 
senschaften 7 (1916) 189-196; English Translation, S. Antoci and A. Loinger, 
“On the gravitational field of a mass point according to Einstein’s theory”, 
ar Xiv:physics/9905030. 

C. Heinicke and F. W. Hehl, Schwarzschild and Kerr solutions of Einstein’s field 
equation: An introduction, in One Hundred Years of General Relativity: From Genesis 
and Foundations to Gravitational Waves, Cosmology and Quantum Gravity, Chap. 3, 
ed. W.-T. Ni (World Scientific, Singapore, 2016); Int. J. Mod. Phys. D 25 (2015) 
1530006. 

A. Einstein, Kosmologische Betrachtungen zur allgemeinen Relativitaetstheorie, Sitz. 
ber. K. Preuss. Akad. Wiss. Berlin 1 (1917) 142. 

W. de Sitter, On the relativity of inertia: Remarks concerning Einstein’s latest hypoth- 
esis, Proc. Kon. Ned. Akad. Wet. 19 (1917) 1217. 

W. de Sitter, The curvature of space, Proc. Kon. Ned. Akad. Wet. 20 (1917) 229. 
W. de Sitter, Proc. Kon. Ned. Akad. Wet. 20 (1917) 1309. 

W. de Sitter, Mon. Not. R. Astron. Soc. 78 (1917) 3. 


1-108 W.-T. Ni 


86. 


87. 


88. 


89. 


90. 


91. 


92. 


93. 


94. 


95. 


M. Bucher and W.-T. Ni, General relativity and cosmology, in One Hundred Years of 
General Relativity: From Genesis and Foundations to Gravitational Waves, Cosmology 
and Quantum Gravity, Chap. 13, ed. W.-T. Ni (World Scientific, Singapore, 2016); 
Int. J. Mod. Phys. D 24 (2015) 1530030. 

M. Davis, Cosmic structure in One Hundred Years of General Relativity: From Genesis 
and Empirical Foundations to Gravitational Waves, Cosmology and Quantum Gravity, 
ed. W.-T. Ni (World Scientific, Singapore, 2016); Int. J. Mod. Phys. D 23 (2014) 
1430011. 

M. Bucher, Physics of the cosmic microwave background anisotropy, in One Hundred 
Years of General Relativity: From Genesis and Empirical Foundations to Gravitational 
Waves, Cosmology and Quantum Gravity, ed. W.-T. Ni (World Scientific, Singapore, 
2016); Int. J. Mod. Phys. D 24 (2015) 1530004. 

X. Meng, Y. Gao and Z. Han, SNe Ia as a cosmological probe, in One Hundred Years of 
General Relativity: From Genesis and Empirical Foundations to Gravitational Waves, 
Cosmology and Quantum Gravity, ed. W.-T. Ni (World Scientific, Singapore, 2016); 
Int. J. Mod. Phys. D 24 (2015) 1530029. 

T. Futamase, Gravitational lensing in cosmology, in One Hundred Years of General 
Relativity: From Genesis and Empirical Foundations to Gravitational Waves, Cos- 
mology and Quantum Gravity, ed. W.-T. Ni (World Scientific, Singapore, 2016); Int. 
J. Mod. Phys. D 24 (2015) 530011. 

K. Sato and J. Yokoyama, Inflationary cosmology: First 30+ Years, in One Hundred 
Years of General Relativity: From Genesis and Empirical Foundations to Gravitational 
Waves, Cosmology and Quantum Gravity, ed. W.-T. Ni (World Scientific, Singapore, 
2016); Int. J. Mod. Phys. D 24 (2015) 1530025. 

D. Chernoff and H. Tye, Inflation, string theory and cosmic strings, in One Hundred 
Years of General Relativity: From Genesis and Empirical Foundations to Gravitational 
Waves, Cosmology and Quantum Gravity, ed. W.-T. Ni (World Scientific, Singapore, 
2016); Int. J. Mod. Phys. D 24 (2015) 1530010. 

Lense and H. Thirring, Phys. Z. 19 (1918) 156 (in German); Gen. Relativ. Gravit. 
16 (1984) 712. 

B. P. Abbott, LIGO Scientific and Virgo Collab., Observation of gravitational waves 
from a binary black hole merger, Phys. Rev. Lett. 116 (2016) 061102. 

B. P. Abbott, LIGO Scientific and Virgo Collab., GW151226: Observation of gravi- 
tational waves from a 22-solar-mass binary black hole coalescence, Phys. Rev. Lett. 
116 (2016) 241103. 


Chapter 3 


Schwarzschild and Kerr solutions of Einstein’s 
field equation: An Introduction 


Christian Heinicke*,‡ and Friedrich W. Hehl*,†,§


*Institute for Theoretical Physics,
University of Cologne, 50923 Köln, Germany


†Department of Physics and Astronomy,
University of Missouri, Columbia, MO 65211, USA
‡christian.heinicke@t-online.de
§hehl@thp.uni-koeln.de


Starting from Newton’s gravitational theory, we give a general introduction into the 
spherically symmetric solution of Einstein’s vacuum field equation, the
Schwarzschild(-Droste) solution, and into one specific stationary axially symmetric solution, the Kerr
solution. The Schwarzschild solution is unique and its metric can be interpreted as 
the exterior gravitational field of a spherically symmetric mass. The Kerr solution is 
only unique if the multipole moments of its mass and its angular momentum take on 
prescribed values. Its metric can be interpreted as the exterior gravitational field of a 
suitably rotating mass distribution. Both solutions describe objects exhibiting an event 
horizon, a frontier of no return. The corresponding notion of a black hole is explained 
to some extent. Eventually, we present some generalizations of the Kerr solution. 


Keywords: General relativity; Kerr and Schwarzschild solutions; black holes; gravito- 
electromagnetism; torsion. 
1. Prelude^a


In Sec. 1.1, we provide some background material on Newton's theory of gravity and, 
in Sec. 1.2, on the flat and gravity-free Minkowski space of special relativity theory. 
Both theories were superseded by Einstein's gravitational theory, general relativity. 
In Sec. 1.3, we supply some machinery for formulating Einstein's field equation 
without and with the cosmological constant. 


1.1. Newtonian gravity 


Newton's gravitational theory is described — in particular tidal gravitational 
forces — and applied to a spherically symmetric body (a “star”). 


^a Parts of Secs. 1 and 2 are adapted from our presentation®? in Falcke et al.5°

Gravity exists in all bodies universally and is proportional to the quantity
of matter in each [...] If two globes gravitate towards each other, and
their matter is homogeneous on all sides in regions that are equally distant
from their centers, then the weight of either globe towards the other will be
inversely as the square of the distance between the centers.


Isaac Newton!*6 (1687) 


The gravitational force of a point-like mass $m_2$ on a similar one of mass $m_1$ is
given by Newton’s attraction law

$$\mathbf{F}_{2\to 1} = -G\,\frac{m_1 m_2}{|\mathbf{r}|^2}\,\frac{\mathbf{r}}{|\mathbf{r}|}, \tag{1}$$

where $G$ is Newton’s gravitational constant (CODATA, 2010),

$$G = 6.67384(80) \times 10^{-11}\ \frac{(\mathrm{m/s})^4}{\mathrm{N}},$$

that is, in units of $\mathrm{m^3\,kg^{-1}\,s^{-2}}$. The vector $\mathbf{r} := \mathbf{r}_1 - \mathbf{r}_2$ points from $m_2$ to $m_1$, see Fig. 1.

According to actio = reactio (Newton’s third law), we have $\mathbf{F}_{2\to 1} = -\mathbf{F}_{1\to 2}$.
Thus, the gravitational interaction of the two masses onto each other is completely
symmetric. Let us now distinguish the mass $m_2$ as field-generating
active gravitational mass and $m_1$ as (point-like) passive test-mass. Accordingly, we
introduce a hypothetical gravitational field as describing the force per unit mass
($m_2 \to M$, $m_1 \to m$)

$$\mathbf{f} := \frac{\mathbf{F}}{m} = -\frac{GM}{|\mathbf{r}|^2}\,\frac{\mathbf{r}}{|\mathbf{r}|}. \tag{2}$$


Fig. 1. Two mass points $m_1$ and $m_2$ attracting each other in three-dimensional space, Cartesian
coordinates x, y, z.
Fig. 2. The “source” M attracts the test mass m.


With this definition, the force acting on the test-mass m is equal to field strength ×
gravitational charge (mass), or $\mathbf{F}_{M\to m} = m\,\mathbf{f}$, in analogy to electrodynamics. The
active gravitational mass M is thought to emanate a gravitational field which is
always directed to the center of M and has the same magnitude on every sphere with
M as center, see Fig. 2. Let us now investigate the properties of the gravitational
field (2). Obviously, there exists a potential

$$\phi = -G\,\frac{M}{|\mathbf{r}|}, \qquad \mathbf{f} = -\nabla\phi. \tag{3}$$

Accordingly, the gravitational field is curl-free: $\nabla \times \mathbf{f} = 0$.

By assumption it is clear that the source of the gravitational field is the mass
M. We find, indeed,

$$\nabla \cdot \mathbf{f} = -4\pi G M\,\delta^3(\mathbf{r}), \tag{4}$$


where $\delta^3(\mathbf{r})$ is the three-dimensional (3D) delta function. By means of the Laplace
operator $\Delta := \nabla \cdot \nabla$, we infer for the gravitational potential

$$\Delta\phi = 4\pi G M\,\delta^3(\mathbf{r}). \tag{5}$$

The term $M\delta^3(\mathbf{r})$ may be viewed as the mass density of a point mass. Equation (5)
is a second order linear partial differential equation for $\phi$. Thus, the gravitational
potential generated by several point masses is simply the linear superposition of
the respective single potentials. Hence, we can generalize the Poisson equation (5)
straightforwardly to a continuous matter distribution $\rho(\mathbf{r})$

$$\Delta\phi = 4\pi G\rho. \tag{6}$$

This equation interrelates the source $\rho$ of the gravitational field with the
gravitational potential $\phi$ and thus completes the quasi-field theoretical description of
Newton’s gravitational theory.
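As a quick numerical illustration of Eqs. (2) and (3), the following Python sketch compares the closed-form field $\mathbf{f}$ of a point mass with a central-difference estimate of $-\nabla\phi$. The function names, the sample position and the step size h are illustrative assumptions and not part of the original text; G is the CODATA 2010 value quoted above, and the mass is the solar mass used later in this section.

```python
import math

G = 6.67384e-11  # m^3 kg^-1 s^-2 (CODATA 2010 value quoted in the text)

def potential(M, r):
    """Point-mass potential phi = -G M / |r|, Eq. (3)."""
    return -G * M / math.sqrt(sum(c * c for c in r))

def field(M, r):
    """Point-mass field f = -(G M / |r|^2) (r / |r|), Eq. (2)."""
    d = math.sqrt(sum(c * c for c in r))
    return tuple(-G * M * c / d**3 for c in r)

def minus_gradient(M, r, h=1.0e5):
    """Central-difference estimate of -grad(phi); the step h (metres) is an assumed value."""
    result = []
    for i in range(3):
        plus, minus = list(r), list(r)
        plus[i] += h
        minus[i] -= h
        result.append(-(potential(M, plus) - potential(M, minus)) / (2.0 * h))
    return tuple(result)

M_SUN = 1.989e30                     # kg
point = (1.0e11, 5.0e10, -2.0e10)    # m, arbitrary sample position (assumption)
f_exact = field(M_SUN, point)
f_numeric = minus_gradient(M_SUN, point)
for exact, numeric in zip(f_exact, f_numeric):
    print(f"{exact:.6e}  {numeric:.6e}")
```

The two columns should agree closely, and each field component has the opposite sign to the corresponding coordinate, reflecting that the field points toward the source.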
We speak here of quasi-field theoretical because the field $\phi$ as such represents
a convenient concept. However, it has no dynamical properties, no genuine degrees 
of freedom. The Newtonian gravitational theory is an action at a distance theory 
(also called mass-interaction theory). When we remove the source, the field vanishes 
instantaneously. Newton himself was very unhappy about this consequence. There- 
fore, he emphasized the preliminary and purely descriptive character of his theory. 
But before we liberate the gravitational field from this constraint by equipping it 
with its own degrees of freedom within the framework of general relativity theory, 
we turn to some properties of the Newtonian theory. 

A very peculiar fact characteristic of the gravitational field is that the
acceleration of a freely falling test-body does not depend on the mass of this body but
only on its position within the gravitational field. This comes about because of the
equality (in suitable units) of the gravitational and the inertial mass

$$\overset{\text{inertial}}{m}\,\ddot{\mathbf{r}} = \mathbf{F} = \overset{\text{grav}}{m}\,\mathbf{f}. \tag{7}$$

This equality has been well tested since Galileo’s time by means of pendulum and
other experiments with an ever increasing accuracy, see Will.189

In order to allow for a more detailed description of the structure of a gravita- 
tional field, we introduce the concept of tidal force. This can be best illustrated by 
means of Fig. 3. In a spherically symmetric gravitational field, for example, two 
test-masses will fall radially toward the center and thereby get closer and closer. 
Similarly, a spherical drop of water is deformed to an ellipsoidal shape because 
the gravitational force at its bottom is bigger than at its top, which has a greater 
distance to the source. If the distance between two freely falling test masses is rel- 
atively small, we can derive an explicit expression for their relative acceleration by 
means of a Taylor expansion. Consider two mass points with position vectors $\mathbf{r}$ and
$\mathbf{r} + \delta\mathbf{r}$, with $|\delta\mathbf{r}| \ll 1$. Then the relative acceleration reads

$$\delta\mathbf{a} = [\mathbf{f}(\mathbf{r} + \delta\mathbf{r}) - \mathbf{f}(\mathbf{r})] = \delta\mathbf{r} \cdot (\nabla\mathbf{f}). \tag{8}$$


Fig. 3. Tidal forces emerging between two freely falling particles and deforming a spherical body.


We may rewrite this according to (the sign is conventional, $\partial/\partial x^a =: \partial_a$, $x^1 = x$,
$x^2 = y$, $x^3 = z$)

$$K_{ab} := -(\nabla\mathbf{f})_{ab} = -\partial_a f_b, \qquad a, b = 1, 2, 3. \tag{9}$$

We call $K_{ab}$ the tidal force matrix. The vanishing curl of the gravitational field
is equivalent to its symmetry, $K_{ab} = K_{ba}$. Furthermore, $K_{ab} = \partial_a\partial_b\phi$. Thus, the
Poisson equation becomes

$$\sum_{a=1}^{3} K_{aa} = \operatorname{trace} K = 4\pi G\rho. \tag{10}$$

Accordingly, in vacuum $K_{ab}$ is trace-free.
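For a point mass, differentiating $\phi = -GM/|\mathbf{r}|$ twice gives $K_{ab} = GM\,(\delta_{ab}/|\mathbf{r}|^3 - 3x_a x_b/|\mathbf{r}|^5)$. The short Python sketch below evaluates this expression and checks the symmetry and the vanishing trace stated above. The function name and the sample positions are assumptions chosen for illustration only.

```python
import math

G = 6.67384e-11   # m^3 kg^-1 s^-2 (CODATA 2010 value quoted in the text)
M_SUN = 1.989e30  # kg

def tidal_matrix(M, r):
    """K_ab = d_a d_b phi for phi = -G M / |r| (valid away from the source)."""
    d = math.sqrt(sum(c * c for c in r))
    return [[G * M * ((1.0 if a == b else 0.0) / d**3 - 3.0 * (r[a] * r[b]) / d**5)
             for b in range(3)] for a in range(3)]

# Generic position (assumed): the matrix is symmetric and trace-free in vacuum.
K = tidal_matrix(M_SUN, (1.2e11, 0.8e11, 0.3e11))
trace = K[0][0] + K[1][1] + K[2][2]
largest = max(abs(K[a][b]) for a in range(3) for b in range(3))
print("relative trace:", abs(trace) / largest)

# Position on the x-axis (assumed distance): radial and transverse components.
K_axis = tidal_matrix(M_SUN, (1.5e11, 0.0, 0.0))
print("K_xx / K_yy on the axis:", K_axis[0][0] / K_axis[1][1])
```

On the axis, the radial component is minus twice the transverse one. With the sign convention of Eq. (9), this corresponds to stretching along the radial direction and compression perpendicular to it, as in Fig. 3.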

Let us now investigate the gravitational potential of a homogeneous star with
constant mass density $\rho_\odot$ and total mass $M_\odot = (4/3)\pi R_\odot^3\rho_\odot$. For our Sun, the
radius is $R_\odot = 6.9598 \times 10^{8}$ m and the total mass is $M_\odot = 1.989 \times 10^{30}$ kg.

Outside the Sun (in the idealized picture we are using here), we have vacuum.
Accordingly, $\rho(\mathbf{r}) = 0$ for $|\mathbf{r}| > R_\odot$. Then the Poisson equation reduces to the
Laplace equation

$$\Delta\phi = 0, \qquad \text{for } r > R_\odot. \tag{11}$$

In 3D polar coordinates, the r-dependent part of the Laplacian has the form
$(1/r^2)\,\partial_r(r^2\partial_r)$. Thus, (11) has the solution

$$\phi = \frac{\alpha}{r} + \beta, \tag{12}$$

where $\alpha$ and $\beta$ are integration constants. Requiring that the potential tends to zero
as r goes to infinity, we get $\beta = 0$. The integration constant $\alpha$ will be determined
from the requirement that the force should change smoothly as we cross the star’s
surface, that is, the interior and exterior potential and their first derivatives have
to be matched continuously at $r = R_\odot$.

Inside the star we have to solve

$$\Delta\phi = 4\pi G\rho_\odot, \qquad \text{for } r \le R_\odot. \tag{13}$$

We find

$$\phi = \frac{2}{3}\pi G\rho_\odot r^2 + \frac{C_1}{r} + C_2, \tag{14}$$

with integration constants $C_1$ and $C_2$. We demand that the potential in the center
r = 0 has a finite value, say $\phi_0$. This requires $C_1 = 0$. Thus

$$\phi = \frac{2}{3}\pi G\rho_\odot r^2 + \phi_0 = \frac{GM(r)}{2r} + \phi_0, \tag{15}$$

where we introduced the mass function $M(r) = (4/3)\pi r^3\rho_\odot$ which measures the
total mass inside a sphere of radius r.
Fig. 4. Newtonian potential of a homogeneous star, with the interior and exterior regions separated at $R_\odot$.


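Continuous matching of $\phi$ and its first derivatives at $r = R_\odot$ finally yields

$$\phi(\mathbf{r}) = \begin{cases} -G\,\dfrac{M_\odot}{|\mathbf{r}|} & \text{for } |\mathbf{r}| \ge R_\odot, \\[2ex] \dfrac{GM_\odot}{2R_\odot^3}\,|\mathbf{r}|^2 - \dfrac{3GM_\odot}{2R_\odot} & \text{for } |\mathbf{r}| < R_\odot. \end{cases} \tag{16}$$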


The slope of this curve indicates the magnitude of the gravitational force, the 
curvature (second derivative) the magnitude of the tidal force (or acceleration). 
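The piecewise potential of a homogeneous sphere, $\phi = -GM_\odot/r$ outside and $\phi = GM_\odot r^2/(2R_\odot^3) - 3GM_\odot/(2R_\odot)$ inside, can be checked numerically. The following Python sketch is a minimal illustration using the solar values quoted in this section; the function names are assumptions made for this example.

```python
G = 6.67384e-11   # m^3 kg^-1 s^-2 (CODATA 2010 value quoted in the text)
M_SUN = 1.989e30  # kg
R_SUN = 6.9598e8  # m

def star_potential(r, M=M_SUN, R=R_SUN):
    """Potential of a homogeneous sphere at radial distance r >= 0."""
    if r >= R:
        return -G * M / r
    return G * M * r**2 / (2.0 * R**3) - 3.0 * G * M / (2.0 * R)

def star_field_magnitude(r, M=M_SUN, R=R_SUN):
    """|d phi / dr|: grows linearly inside the star, falls off as 1/r^2 outside."""
    if r >= R:
        return G * M / r**2
    return G * M * r / R**3

phi_center = star_potential(0.0)
phi_surface = star_potential(R_SUN)
g_surface = star_field_magnitude(R_SUN)
print("phi(0) / phi(R):", phi_center / phi_surface)
print("surface field strength in m/s^2:", g_surface)
```

The central potential is 3/2 times the surface potential, and both the potential and the field strength are continuous across the surface, which is the matching condition used to fix the integration constants.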


1.2. Minkowski space 


When, in a physical experiment, gravity can be safely neglected, we seem to live in
the flat Minkowski space of special relativity theory. We introduce the metric of the
Minkowski space and rewrite it in terms of so-called null coordinates, that is, we
use light rays for a parametrization of Minkowski space.


Henceforth space by itself, and time by itself, are doomed to fade away
into mere shadows, and only a kind of union of the two will preserve an
independent reality.


Hermann Minkowski (1908) 


It was Minkowski who welded space and time together into spacetime, thereby 
abandoning the observer-independent meaning of spatial and temporal distances. 
Instead, the spatio-temporal distance, the line element

$$ds^2 = -c^2 dt^2 + dx^2 + dy^2 + dz^2,$$


is distinguished as the invariant measure of spacetime. The Poincaré (or inhomoge- 
neous Lorentz) transformations form the invariance group of this spacetime metric. 
The principle of the constancy of the speed of light is embodied in the equation 
ds? = 0. Suppressing one spatial dimension, the solutions of this equation can be 
regarded as a double cone. This light cone visualizes the paths of all possible light 
rays arriving at or emitted from the cone’s apex. Picturing the light cone structure, 
and thereby the causal properties of spacetime, will be our method for analyzing 
the meaning of the Schwarzschild and the Kerr solution. 


1.2.1. Null coordinates 


We first introduce so-called null coordinates. The Minkowski metric (with c = 1)
in spherical polar coordinates reads

$$ds^2 = -dt^2 + dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\phi^2) \equiv -dt^2 + dr^2 + r^2 d\Omega^2. \tag{17}$$

We define advanced and retarded null coordinates according to

$$v := t + r, \qquad u := t - r, \tag{18}$$

and find

$$ds^2 = -dv\,du + \frac{1}{4}(v - u)^2\,d\Omega^2. \tag{19}$$

In Fig. 5 we show the Minkowski spacetime in terms of the new coordinates.
Incoming photons, that is, point-like particles with velocity $\dot{r} = -c = -1$, move on paths
with v = const. Correspondingly, we have for outgoing photons u = const. The
special relativistic wave equation is solved by any function f(u) and f(v). The
surfaces f(u) = const. and f(v) = const. represent the wavefronts which evolve with
the velocity of light. The trajectory of every material particle with $\dot{r} < c = 1$ has
to remain inside the region defined by the surface r = t. In an (r,t)-diagram, this
surface is represented by a cone, the so-called light cone. Any point in the future
light cone r = t can be reached by a particle or signal with a velocity less than c.
A given spacetime point P can be reached by a particle or signal from the spacetime
region enclosed by the past light cone r = −t.


1.2.2. Penrose diagram 


We can map, following Penrose, the infinitely distant points of spacetime into finite
regions by means of a conformal transformation which leaves the light cones intact.
Then we can display the whole infinite Minkowski spacetime on a (finite) piece of
paper.

Fig. 5. Minkowski spacetime in null coordinates.

Accordingly, introduce the new coordinates

$$\tilde{v} := \arctan v, \qquad \tilde{u} := \arctan u, \qquad \text{for } -\frac{\pi}{2} \le (\tilde{v}, \tilde{u}) \le +\frac{\pi}{2}. \tag{20}$$

Then the metric reads

$$ds^2 = \frac{1}{\cos^2\tilde{v}\,\cos^2\tilde{u}}\left[-d\tilde{v}\,d\tilde{u} + \frac{1}{4}\sin^2(\tilde{v} - \tilde{u})\,d\Omega^2\right]. \tag{21}$$

We can go back to time- and space-like coordinates by means of the transformation

$$\tilde{t} := \tilde{v} + \tilde{u}, \qquad \tilde{r} := \tilde{v} - \tilde{u}, \tag{22}$$

see (18). Then the metric reads

$$ds^2 = \frac{-d\tilde{t}^2 + d\tilde{r}^2 + \sin^2\tilde{r}\,d\Omega^2}{4\cos^2\frac{\tilde{t} + \tilde{r}}{2}\,\cos^2\frac{\tilde{t} - \tilde{r}}{2}}, \tag{23}$$

that is, up to the function in the denominator, it appears as a flat metric. Such a
metric is called conformally flat (it is conformal to a static Einstein cosmos). The
back-transformation to our good old Minkowski coordinates reads

$$t = \frac{1}{2}\left(\tan\frac{\tilde{t} + \tilde{r}}{2} + \tan\frac{\tilde{t} - \tilde{r}}{2}\right), \tag{24}$$

$$r = \frac{1}{2}\left(\tan\frac{\tilde{t} + \tilde{r}}{2} - \tan\frac{\tilde{t} - \tilde{r}}{2}\right). \tag{25}$$
Fig. 6. Penrose diagram of Minkowski spacetime.


Our new coordinates $\tilde{t}, \tilde{r}$ extend only over a finite range of values, as can be seen
from (24) and (25). Thus, in the Penrose diagram of a Minkowski spacetime, see
Fig. 6, we can depict the whole Minkowski spacetime, with a coordinate singularity
along $\tilde{r} = 0$. All trajectories of uniformly moving particles (with velocity smaller
than c) emerge from one single point, past infinity $I^-$, and all will eventually arrive
at the one single point $I^+$, namely at future infinity. All incoming photons have
their origin on the segment $\mathscr{I}^-$ (script $I^-$ or “scri minus”), light-like past infinity,
and will run into the coordinate singularity on the $\tilde{t}$-axis. All outgoing photons arise
from the coordinate singularity and cease on the line $\mathscr{I}^+$, light-like future infinity
(“scri plus”). The entire spacelike infinity is mapped into the single point $I^0$. For
later reference we collect these notions in a table (Table 1).
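The compactification can be tried out numerically. The Python sketch below maps Minkowski coordinates (t, r) to the compactified pair via $v = t + r$, $u = t - r$, $\tilde{v} = \arctan v$, $\tilde{u} = \arctan u$, $\tilde{t} = \tilde{v} + \tilde{u}$, $\tilde{r} = \tilde{v} - \tilde{u}$, and inverts the map with the tangent formulas of the back-transformation. The function names and sample points are illustrative assumptions.

```python
import math

def to_compact(t, r):
    """Map Minkowski (t, r) to the compactified coordinates (t~, r~)."""
    v_tilde = math.atan(t + r)   # v~ = arctan v, with v = t + r
    u_tilde = math.atan(t - r)   # u~ = arctan u, with u = t - r
    return v_tilde + u_tilde, v_tilde - u_tilde

def from_compact(t_tilde, r_tilde):
    """Inverse map back to Minkowski (t, r)."""
    a = math.tan(0.5 * (t_tilde + r_tilde))
    b = math.tan(0.5 * (t_tilde - r_tilde))
    return 0.5 * (a + b), 0.5 * (a - b)

for t, r in [(0.0, 0.0), (2.0, 3.0), (1.0e6, 0.0), (0.0, 1.0e6)]:
    t_tilde, r_tilde = to_compact(t, r)
    print((t, r), "->", (round(t_tilde, 6), round(r_tilde, 6)))
```

Events at very late times with r = 0 approach $\tilde{t} = \pi$ (the point $I^+$), while events at very large r with t = 0 approach $\tilde{r} = \pi$ (the point $I^0$), so the whole spacetime fits into a finite coordinate range.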

Now, we have a really compact picture of the Minkowski space. Next, we would 
like to proceed along similar lines in order to obtain an analogous picture for the 
Schwarzschild spacetime. 


Table 1. The different infinities in Penrose diagrams. 


$I^-$             Timelike past infinity       Origin of all particles
$I^+$             Timelike future infinity     Destination of all particles
$I^0$             Spacelike infinity           Inaccessible for all particles
$\mathscr{I}^-$   Lightlike past infinity      Origin of all light rays
$\mathscr{I}^+$   Lightlike future infinity    Destination of all light rays
1.3. Einstein’s field equation


We display our notations and conventions for the differential geometric tools used 
to formulate Einstein's field equation. 

We assume that our readers know at least the rudiments of general relativity 
(GR) as represented, for instance, in Einstein’s Meaning of Relativity,°° which we 
still recommend as a gentle introduction into GR. More advanced readers may then 
want to turn to Rindler!® and/or to Landau—Lifshitz.1°° 

We assume a 4D Riemannian spacetime with (Minkowski-)Lorentz signature
(−+++), see Misner, Thorne, and Wheeler.127 Thus, the metric field, in arbitrary
holonomic coordinates $x^\mu$, with $\mu = 0, 1, 2, 3$, reads

$$g \equiv ds^2 = g_{\mu\nu}\,dx^\mu \otimes dx^\nu. \tag{26}$$


By partial differentiation of the metric, we can calculate the Christoffel symbols 
(Levi-Civita connection) 


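$$\Gamma^\mu{}_{\alpha\beta} := \frac{1}{2}\,g^{\mu\gamma}\left(\partial_\alpha g_{\beta\gamma} + \partial_\beta g_{\gamma\alpha} - \partial_\gamma g_{\alpha\beta}\right). \tag{27}$$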
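To make the Christoffel formula $\Gamma^\mu{}_{\alpha\beta} = \frac{1}{2} g^{\mu\gamma}(\partial_\alpha g_{\beta\gamma} + \partial_\beta g_{\gamma\alpha} - \partial_\gamma g_{\alpha\beta})$ concrete, the Python sketch below evaluates it by finite differences. As a simplifying assumption it uses flat 3D space in spherical coordinates $(r, \theta, \phi)$, where the metric is diagonal, rather than a curved spacetime; the function names and step size are likewise illustrative.

```python
import math

def metric_diag(x):
    """Diagonal components (g_rr, g_thetatheta, g_phiphi) of flat 3D space."""
    r, theta, _phi = x
    return [1.0, r**2, (r * math.sin(theta))**2]

def d_metric(x, c, i, j, h=1.0e-6):
    """Central-difference estimate of d_c g_ij (zero off the diagonal here)."""
    if i != j:
        return 0.0
    plus, minus = list(x), list(x)
    plus[c] += h
    minus[c] -= h
    return (metric_diag(plus)[i] - metric_diag(minus)[i]) / (2.0 * h)

def christoffel(x):
    """gamma[mu][a][b] = Gamma^mu_{ab} for a diagonal metric."""
    g = metric_diag(x)
    n = len(g)
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for mu in range(n):
        for a in range(n):
            for b in range(n):
                # For a diagonal metric only the term with gamma = mu survives.
                gamma[mu][a][b] = 0.5 / g[mu] * (
                    d_metric(x, a, b, mu) + d_metric(x, b, mu, a) - d_metric(x, mu, a, b)
                )
    return gamma

point = (2.0, 1.0, 0.3)   # (r, theta, phi), arbitrary sample point
Gamma = christoffel(point)
print(Gamma[0][1][1], -point[0])        # Gamma^r_{theta theta} versus -r
print(Gamma[1][0][1], 1.0 / point[0])   # Gamma^theta_{r theta} versus 1/r
```

Although the space is flat, the Christoffel symbols do not vanish in these curvilinear coordinates. This illustrates why they cannot serve as a tensorial measure of the gravitational field.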


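This empowers us to determine the geodesics (curves of extremal length) of the
Riemannian spacetime

$$\frac{D^2 x^\alpha}{D\tau^2} := \frac{d^2 x^\alpha}{d\tau^2} + \Gamma^\alpha{}_{\mu\nu}\,\frac{dx^\mu}{d\tau}\,\frac{dx^\nu}{d\tau} = 0. \tag{28}$$

This equation can be read as a vanishing of the 4D covariant acceleration. If we
define the four-velocity $u^\alpha := dx^\alpha/d\tau$, then the geodesics can be rewritten as

$$\frac{Du^\alpha}{D\tau} = \frac{du^\alpha}{d\tau} + \Gamma^\alpha{}_{\mu\nu}\,u^\mu u^\nu = 0. \tag{29}$$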

In a neighborhood of any given point in spacetime, we can introduce Riemannian 
normal coordinates, which are such that the Christoffels vanish at that point. In 
order to find a tensorial measure of the gravitational field, we have to go one 
differentiation order higher. By partial differentiation of the Christoffels, we find 
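the Riemann curvature tensor^b

$$R^\mu{}_{\nu\alpha\beta} := 2\left(\partial_{[\alpha}\Gamma^\mu{}_{|\nu|\beta]} + \Gamma^\mu{}_{\sigma[\alpha}\Gamma^\sigma{}_{|\nu|\beta]}\right). \tag{30}$$

The curvature is doubly antisymmetric, its two index pairs commute, and its totally
antisymmetric piece vanishes

$$R_{(\mu\nu)\alpha\beta} = 0, \qquad R_{\mu\nu(\alpha\beta)} = 0; \qquad R_{\mu\nu\alpha\beta} = R_{\alpha\beta\mu\nu}; \qquad R_{[\mu\nu\alpha\beta]} = 0. \tag{31}$$


^b Symmetrization of indices is always denoted by parentheses, $(\alpha\beta) := \{\alpha\beta + \beta\alpha\}/2!$,
antisymmetrization by brackets, $[\alpha\beta] := \{\alpha\beta - \beta\alpha\}/2!$, with corresponding generalizations $(\alpha\beta\gamma) :=
\{\alpha\beta\gamma + \beta\gamma\alpha + \gamma\alpha\beta + \cdots\}/3!$, etc. Indices standing between two vertical strokes | | are excluded
from the (anti)symmetrization process, see Schouten.!7°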
If we define collective indices A, B, ... = 1, ..., 6 for the antisymmetric index pairs
according to the rule {01, 02, 03; 23, 31, 12} → {1, 2, 3; 4, 5, 6}, then the algebraic
symmetries of (31) can be rephrased as

$$R_{AB} = R_{BA}, \qquad \operatorname{trace}(R_{AB}) = 0. \tag{32}$$

Thus, in 4D the curvature can be represented as a trace-free symmetric 6 × 6 matrix.
Hence, it has 20 independent components.

With the curvature tensor, we found a tensorial measure for the gravitational
field. Freely falling particles move along geodesics of Riemannian spacetime. What
about the tidal accelerations between two freely falling particles? Let the
“infinitesimal” vector $n^\alpha$ describe the distance between two particles moving on adjacent
geodesics. A standard calculation,127 to linear order in n, yields the geodesic
deviation equation

$$\frac{D^2 n^\alpha}{D\tau^2} = u^\beta u^\gamma R^\alpha{}_{\beta\gamma\delta}\,n^\delta. \tag{33}$$

This equation describes the relative acceleration of neighboring particles, similar to
(8) and (9) in the Newtonian case. The role of the tidal matrix $K_{ab}$ is taken over
by $K^\alpha{}_\delta := u^\beta u^\gamma R^\alpha{}_{\beta\gamma\delta}$.

By contraction of the curvature, we can define the second rank Ricci tensor $R_{\mu\nu}$
and the curvature scalar R, respectively,

$$R_{\mu\nu} := R^\alpha{}_{\mu\alpha\nu}, \qquad R := g^{\mu\nu}R_{\mu\nu}. \tag{34}$$

For convenience, we can also introduce the Einstein tensor $G_{\mu\nu} := R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R$.
The curvature with its 20 independent components can be irreducibly decomposed
into smaller pieces according to 20 = 10 + 9 + 1. The Weyl curvature tensor $C_{\alpha\beta\gamma\delta}$ is
trace-free and has 10 independent components, whereas the trace-free Ricci tensor
has nine components and the curvature scalar just one.

Now we have all the tools for displaying Einstein’s field equation. With G as
Newton’s gravitational constant and c as velocity of light, we define Einstein’s
gravitational constant $\kappa := 8\pi G/c^4$. Then, the Einstein field equation with cosmological
constant $\Lambda$ reads

$$R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R + \Lambda g_{\mu\nu} = \kappa T_{\mu\nu}. \tag{35}$$

The source on the right-hand side is the energy-momentum tensor of matter. The
vacuum field equation, without cosmological constant, simply reduces to $R_{\mu\nu} = 0$.
This equation will keep us busy for most of this paper. A vanishing Ricci tensor
implies that only the Weyl curvature can be nonzero, $C_{\alpha\beta\gamma\delta} \neq 0$. Accordingly, the vacuum field in
GR (without $\Lambda$) is represented by the Weyl tensor.

Equation (35) represents a generalization of the Poisson equation (10). There, 
the contraction of the tidal matrix is proportional to the mass density. In GR, the 
contraction of the curvature tensor is proportional to the energy-momentum tensor. 

The physical mass is denoted by M. Usually, we use the mass parameter
$m := GM/c^2$. The Schwarzschild radius reads $r_S := 2m = 2GM/c^2$. Usually we put c = 1
and G = 1. We make explicit use of G and c as soon as we stress analogies to
Newtonian gravity or allude to observational data.
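To get a feeling for the orders of magnitude, the following Python sketch evaluates the mass parameter $m = GM/c^2$ and the Schwarzschild radius $r_S = 2m$ for the Sun, using the values of G, $M_\odot$ and $R_\odot$ quoted in Sec. 1.1. The function names are assumptions for this example, and the speed of light is the exact SI value.

```python
G = 6.67384e-11     # m^3 kg^-1 s^-2 (CODATA 2010 value quoted in Sec. 1.1)
C = 299792458.0     # m/s, exact SI value of the speed of light
M_SUN = 1.989e30    # kg
R_SUN = 6.9598e8    # m

def mass_parameter(M):
    """m = G M / c^2, a length."""
    return G * M / C**2

def schwarzschild_radius(M):
    """r_S = 2 m = 2 G M / c^2."""
    return 2.0 * mass_parameter(M)

m_sun = mass_parameter(M_SUN)
r_s = schwarzschild_radius(M_SUN)
print(f"m = {m_sun:.1f} m, r_S = {r_s:.1f} m, r_S / R_sun = {r_s / R_SUN:.2e}")
```

The Schwarzschild radius of the Sun comes out at roughly 3 km, several orders of magnitude smaller than the solar radius, so the exterior vacuum solution never reaches $r_S$ for an ordinary star.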


2. The Schwarzschild Metric (1916) 


Spatial spherical symmetry is assumed and a corresponding exact solution for Ein- 
stein’s theory searched for. After a historical outline (Sec. 2.1), we apply the equiv- 
alence principle to a freely falling particle and try to implement that on top of the 
Minkowskian line element. In this way, we heuristically arrive at the Schwarzschild 
metric (Sec. 2.2). In Sec. 2.3, we display the Schwarzschild metric in six
different classical coordinate systems. We outline the concept of a Schwarzschild black
hole in Sec. 2.4. In Secs. 2.5 and 2.6, we construct the Penrose diagram for the 
Schwarzschild(-Kruskal) spacetime. We add electric charge to the Schwarzschild 
solution in Sec. 2.7. The interior Schwarzschild metric, with matter, is addressed 
in Sec. 2.8. 


It is quite a wonderful thing that from such an abstract idea the explanation 
of the Mercury anomaly emerges so inevitably. 


Karl Schwarzschild! (1915) 


2.1. Historical remarks 


The genesis of the Schwarzschild solution (1915/16) is described. In particular, we 
show that Droste, a bit later than Schwarzschild, arrived at the Schwarzschild metric 
independently. He put the Schwarzschild solution into that form in which we use it 
today. 

The first exact solution of Einstein’s field equation was born in hospital. Unfor- 
tunately, the circumstances were more tragic than joyful. The astronomer Karl 
Schwarzschild joined the German army right at the beginning of World War I and 
served in Belgium, France and Russia. At the end of the year 1915, he was admit- 
ted to hospital with an acute skin disease. There, not far from the Russian front, 
enduring the distant gunfire, he found time to “stroll through the land of ideas” 
of Einstein’s theory, as he puts it in a letter to Einstein^c dated 22 December 1915.
According to this letter, Schwarzschild started out from the approximate solution 
in Einstein’s “perihelion paper”, published November 25th. Since presumably let- 
ters from Berlin to the Russian front took a few days, Schwarzschild!”? found the 
solution within about a fortnight. Fortunately, the premature field equation of the 
“perihelion paper” is correct in the vacuum case treated by Schwarzschild. 


^c The letters from and to Einstein can be found in Einstein’s Collected Works,°! see also
Schwarzschild’s Collected Works.171




In February 1916, Schwarzschild!”? submitted the spherically symmetric solution
with matter (the “interior Schwarzschild solution”), now based on Einstein’s
final field equation. In March 1916, he was sent home where he passed away on 11
May 1916.


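The field equation used by Schwarzschild requires det g = −1. To fulfill this
condition, he uses modified polar coordinates (Schwarzschild’s original notation
used),

$$x_1 = \frac{r^3}{3}, \qquad x_2 = -\cos\theta, \qquad x_3 = \phi, \qquad x_4 = t.$$

The spherically symmetric ansatz then reads

$$ds^2 = f_4\,dx_4^2 - f_1\,dx_1^2 - f_2\,\frac{dx_2^2}{1 - x_2^2} - f_3\,dx_3^2\,(1 - x_2^2),$$

where $f_1$ to $f_4$ are functions of $x_1$ only. The solution turns out to be

$$f_1 = \frac{1}{R^4}\,\frac{1}{1 - \alpha/R}, \qquad f_2 = f_3 = R^2, \qquad f_4 = 1 - \frac{\alpha}{R}, \qquad R = (3x_1 + \alpha^3)^{1/3}.$$

In this paper, as well as in his letter to Einstein, he eventually returns to the usual
spherical polar coordinates,

$$ds^2 = \left(1 - \frac{\alpha}{R}\right)dt^2 - \frac{dR^2}{1 - \alpha/R} - R^2(d\theta^2 + \sin^2\theta\,d\phi^2), \qquad R = (r^3 + \alpha^3)^{1/3}.$$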

This looks like the Schwarzschild metric we are familiar with. One should note,
however, that the singularity at $R = \alpha$ is (as we know today) a coordinate
singularity; it corresponds to r = 0. In the early discussion, the meaning of such a
singularity was rather obscure. Flamm® in his 1916 article on embedding constant
time slices of the Schwarzschild metric into Euclidean space mentions “the oddity
that a point mass has a finite circumference of $2\pi\alpha$”.
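Two properties of Schwarzschild's original form of the solution can be checked numerically. The Python sketch below assumes the form given in his 1916 paper: coordinate $x_1 = r^3/3$, auxiliary radius $R = (3x_1 + \alpha^3)^{1/3} = (r^3 + \alpha^3)^{1/3}$, and metric functions $f_1 = 1/[R^4(1 - \alpha/R)]$, $f_2 = f_3 = R^2$, $f_4 = 1 - \alpha/R$. The function names and the value of $\alpha$ are illustrative assumptions.

```python
def auxiliary_radius(x1, alpha):
    """R = (3 x_1 + alpha^3)^(1/3), i.e. (r^3 + alpha^3)^(1/3) with x_1 = r^3 / 3."""
    return (3.0 * x1 + alpha**3) ** (1.0 / 3.0)

def metric_functions(x1, alpha):
    """Schwarzschild's f_1, ..., f_4 as functions of x_1 (requires x_1 > 0)."""
    R = auxiliary_radius(x1, alpha)
    f1 = 1.0 / (R**4 * (1.0 - alpha / R))
    f2 = f3 = R**2
    f4 = 1.0 - alpha / R
    return f1, f2, f3, f4

alpha = 1.0   # illustrative value in arbitrary length units (assumption)
for r in (0.5, 1.0, 10.0):
    x1 = r**3 / 3.0
    f1, f2, f3, f4 = metric_functions(x1, alpha)
    print(r, auxiliary_radius(x1, alpha), f1 * f2 * f3 * f4)
```

The product $f_1 f_2 f_3 f_4$ equals 1 for every r > 0, which is the determinant condition Schwarzschild imposed. The auxiliary radius R tends to $\alpha$ as r goes to 0, so the sphere $R = \alpha$ corresponds to the origin of his radial coordinate r.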

In 1917, Weyl188 talks of the “inside” and “outside” of the point mass and
states that “in nature, evidently, only that piece of the solution is realized which
does not touch the singular sphere.” In Hilbert’s®® opinion, the singularity $R = \alpha$
indicates the illusiveness of the concept of a pointlike mass. A point mass is just the
limiting case of a spherically symmetric mass distribution. Illuminating the interior
of “Schwarzschild’s sphere” took quite a while, and it was the discovery of new
coordinates which brought the first elucidations. Lanczos,!°* in 1922, states clearly
that singularities of the metric components do not necessarily have physical
significance since they may vanish in appropriate coordinates. However, it took
another 38 years to find a maximally extended fully regular coordinate system for
the Schwarzschild metric. We will become acquainted with these Kruskal/Szekeres
coordinates in Sec. 2.5.

Schwarzschild’s solution, published in the widely read minutes of the Prussian
Academy and communicated by Einstein himself, almost instantly triggered further
investigations of the gravitational field of a point mass. Already in March
1916, Reissner,!®* a civil engineer by education, published a generalization of the
Schwarzschild metric that includes an electric charge; this was later completed by
Weyl'®8 and by Nordström.!4° Today it is called the Reissner–Nordström solution.

Nevertheless, one should not ignore the Dutch twin of Schwarzschild’s solution. 
On 27 May 1916, Droste*” communicated his results on “the field of a single centre 
in Einstein’s theory of gravitation, and the motion of a particle in that field” to the 
Dutch Academy of Sciences. He presents a very clear and easy to read derivation 
of the metric and gives a quite comprehensive analysis of the motion of a point 
particle. Since 1913, he had been working on general relativity under the supervi- 
sion of Lorentz at Leiden University. Published in Dutch, Droste’s results are fairly 
unknown today. Einstein, probably informed by his close friend Ehrenfest, rather 
appreciated Droste’s work, praising the graceful mathematical style. Weyl!®® also 
cites Droste, but in Hilbert’s®® second communication the reference is not found. 
Einstein, Hilbert, and Weyl always allude to “Schwarzschild’s solution”.

After Droste took his Ph.D. in 1916, he worked as a school teacher and eventually
became professor of mathematics in Leiden. He never resumed his work on Einstein’s
theory, and his name faded from the collective memory of relativists. In Leiden, people
like Lorentz, de Sitter, Nordström, or Fokker learned about the gravitational field of
a point mass primarily from Droste’s work. Thus, the name “Schwarzschild–Droste
solution” would be quite justified from a historical point of view.

The importance of the Schwarzschild metric is made evident by the Birkhoff'®
theorem$^{d}$: For vanishing cosmological constant, the unique spherically symmetric
vacuum spacetime is the Schwarzschild solution, which can be expressed most
conveniently in Schwarzschild coordinates, see Table 3, entry 1. Thus, the vacuum
field of a spherically symmetric body is static (outside the horizon). In particular,
such a body cannot emit gravitational radiation. Moreover, the asymptotically
Minkowskian behavior of the Schwarzschild solution is dictated by the solution
itself; it is not imposed from the outside.


2.2. Approaching the Schwarzschild metric 


We start from an ansatz for the metric of an accelerated motion in the radial direc- 
tion and combine it, in the sense of the equivalence principle, with the free-fall veloc- 
ity of a particle in a Newtonian gravitational field. In this way, we find a curved 
metric that, after a coordinate transformation, turns out to be the Schwarzschild 
metric. 


$^{d}$ The “Birkhoff” theorem was discovered by Jebsen,9° Birkhoff,!® and Alexandrow.* For more
details on Jebsen, see Johansen and Ravndal.9® The objections of Ehlers and Krasiński*® appear
to us as nitpicking.

Einstein, in his 1907 Jahrbuch article,*? suggests the generalization of the relativity
principle to arbitrarily accelerated reference frames.

A plausible notion of a (local) rest frame in general relativity is a frame where 
the coordinate time is equal to the proper time (for an observer spatially at rest, 
of course). For a purely radial motion, the following metric would be an obvious 
ansatz, see also Ref. 185: 


$$ds^2 = -dt^2 + [dr + f(r)\,dt]^2 + r^2 d\Omega^2, \quad \text{with} \quad d\Omega^2 := d\theta^2 + \sin^2\theta\,d\phi^2.$$


For $d\theta = 0$, $d\phi = 0$, and $dr/dt = -f(r)$, we have $ds^2 = -dt^2$. Thereby, $-f(r)$ is
identified as a kind of “radial infall velocity”. Note also that constant time-slices,
$dt = 0$, are Euclidean.

In Newtonian gravity, a particle falling from infinity toward the origin picks up 
a velocity 


$$\frac{dr}{dt} = -\sqrt{2\Phi(r)} = -\sqrt{\frac{2GM}{r}} \quad \Longleftarrow \quad \frac{1}{2}\,m v^2(r) = m\Phi(r) = m\,\frac{GM}{r}. \qquad (36)$$

Here, $\Phi$ is the absolute value of the Newtonian potential of a spherical body with
mass $M$.

Hence, in some Newtonian limit, we demand $f(r) \to \sqrt{2\Phi}$. This leads to the
metric

$$ds^2 = -dt^2 + \left(dr + \sqrt{2\psi}\,dt\right)^2 + r^2 d\Omega^2, \qquad (37)$$

where we allow for an arbitrary potential $\psi = \psi(r)$. This metric generates curvature.
The calculations can be conveniently done even by hand. The Ricci tensor reads

$$R_0{}^0 = R_1{}^1 = \frac{1}{r}\,\partial_r\partial_r(r\psi) = 0, \qquad R_2{}^2 = R_3{}^3 = \frac{2}{r^2}\,\partial_r(r\psi) = 0.$$

The equations $R_0{}^0 = 0 = R_1{}^1$ are mere integrability conditions of the
$R_2{}^2 = 0 = R_3{}^3$ relations. Hence, $\psi$ is determined by its first-order approximation alone and
reads

$$\psi = \frac{\alpha}{r},$$

with $\alpha$ as an unknown constant so far. By construction, we have

$$\frac{dr}{dt} = -\sqrt{2\psi} = -\sqrt{\frac{2\alpha}{r}} = -\sqrt{\frac{2GM}{r}}.$$

The metric (37), expanding the parenthesis and collecting the terms in front of
$dt^2$, reads

$$ds^2 = -\left(1 - \frac{2GM}{r}\right) dt^2 + 2\sqrt{\frac{2GM}{r}}\,dt\,dr + dr^2 + r^2 d\Omega^2. \qquad (38)$$

Using different methods, this metric was derived by Gullstrand’”? in May 1921.
Gullstrand claimed to have found a new spherically symmetric solution of Einstein’s
field equation. In his opinion,$^{e}$ this showed the ambiguity of Einstein’s field equation.
However, the metric is of the form

$$ds^2 = -A\,dt^2 + 2B\,dt\,dr + dr^2 + r^2 d\Omega^2, \qquad A = 1 - \frac{2GM}{r}, \quad B = \sqrt{\frac{2GM}{r}},$$

and can be diagonalized by completing the square via

$$ds^2 = -A\left(dt - \frac{B}{A}\,dr\right)^2 + \left(1 + \frac{B^2}{A}\right) dr^2 + r^2 d\Omega^2.$$

Introducing a new time coordinate,

$$dt_S := dt - \frac{B}{A}\,dr,$$

or, explicitly,

$$t_S = t - \left(2r\sqrt{\frac{2GM}{r}} - 4GM\,\operatorname{Artanh}\sqrt{\frac{2GM}{r}}\right),$$

we arrive at ($A$ and $B$ re-substituted)

$$ds^2 = -\left(1 - \frac{2GM}{r}\right) dt_S^2 + \left(1 - \frac{2GM}{r}\right)^{-1} dr^2 + r^2 d\Omega^2.$$

In contrast to what Gullstrand was aiming at, he “just” rederived the Schwarzschild
metric.
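
The completion of the square and the explicit form of the new time coordinate can be cross-checked numerically. The following Python sketch is our illustration, not part of the original derivation; the function names and the choice $GM = 1$ are assumptions made for the example. It compares a finite-difference derivative of the explicit time shift with $B/A$ and verifies that $1 + B^2/A = 1/A$ for $r > 2GM$.

```python
import math

GM = 1.0  # geometric units (G = c = 1, M = 1); an arbitrary illustrative choice


def A(r):
    """Coefficient A = 1 - 2GM/r of the dt^2 term."""
    return 1.0 - 2.0 * GM / r


def B(r):
    """Coefficient B = sqrt(2GM/r) of the mixed dt dr term."""
    return math.sqrt(2.0 * GM / r)


def time_shift(r):
    """Explicit primitive of B/A for r > 2GM, so that t_S = t - time_shift(r)."""
    s = math.sqrt(2.0 * GM / r)
    return 2.0 * r * s - 4.0 * GM * math.atanh(s)


def central_difference(func, r, h=1e-5):
    """Second-order finite-difference estimate of d(func)/dr."""
    return (func(r + h) - func(r - h)) / (2.0 * h)


def check_time_shift(r):
    """Return (numerical d(time_shift)/dr, B/A); the two should agree."""
    return central_difference(time_shift, r), B(r) / A(r)


def radial_coefficient(r):
    """Coefficient 1 + B^2/A of dr^2 after completing the square."""
    return 1.0 + B(r) ** 2 / A(r)


if __name__ == "__main__":
    for r in (3.0, 5.0, 20.0):
        print(r, check_time_shift(r), radial_coefficient(r), 1.0 / A(r))
```

The check is restricted to $r > 2GM$, where the Artanh expression is real.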


Later, applying a coordinate transformation to the Schwarzschild metric,
Painlevé'” obtained the metric (38) independently and presented his result in
October 1921. His aim was to demonstrate the vacuity of $ds^2$ by showing that an
exact solution does not determine the physical geometry and is therefore meaningless.
In a letter (7th December, 1921) to Painlevé, Einstein stresses on the contrary
the meaninglessness of the coordinates! In the words of Einstein himself (our translation):

“... merely results obtained by eliminating the coordinate dependence can
claim an objective meaning.”


In the subsequent section, we will meet the Schwarzschild metric in many dif- 
ferent coordinate systems. All of them have their merits and their shortcomings. 

Using Gullstrand—Painlevé coordinates for the Schwarzschild metric does not 
change the physics, of course. However, as a coordinate system it is what Gustav 
Mie!?° calls a sensible coordinate system. In contrast to many other coordinate 
systems, the physics looks much as we are used to. As an example, we analyze
the motion of a radially infalling particle in Schwarzschild and Gullstrand–Painlevé
coordinates.


$^{e}$ Gullstrand, who was a member of the Nobel committee, was responsible for the fact that Einstein did not
receive his Nobel Prize for relativity theory. He thought that GR was untenable.

The equations of motion for point particles in general relativity are obtained
via the geodesic equation (28). It can be shown that this equation is equivalent to
the solution of the variational principle $\delta \int L\,d\tau = 0$ with $L := g_{\alpha\beta}\,\dot{x}^\alpha \dot{x}^\beta$. We choose the
proper time $\tau$ for the parametrization of the curve; the dot denotes the derivative
with respect to $\tau$. In the present context, we are only interested in the velocity of
particles along ingoing geodesics (“freely falling particles”). For time-like geodesics
we have $-1 = g_{\alpha\beta}\,\dot{x}^\alpha \dot{x}^\beta$. This allows the algebraic determination of $\dot{r}$ provided we know
$\dot{t}$. Since we consider static metrics here, $t$ is a cyclic variable and $\partial L/\partial\dot{t} = k = \text{const}$.
The constant is determined by the boundary condition $\dot{r} = 0$ for $r \to \infty$.

The difference between the coordinate systems appears in the first line of 
Table 2: In Gullstrand—Painlevé coordinates, the coordinate velocity of a freely 
infalling particle increases smoothly toward the center. Nothing special happens at 
r = 2GM. From a given position, the particle will plunge into the center in a finite 
time. Even numerically this looks quite Newtonian. In contrast, the velocity with 
respect to Schwarzschild coordinates approaches zero as the particle approaches 
r = 2GM. Hence, the particle apparently will not be able to go further than 
r=2GM. 

For the Gullstrand–Painlevé metric, the radial coordinate velocity of incoming light
is always larger than 1 in magnitude; at $r = 2GM$ it is $-2$. For outgoing
rays, it vanishes at $r = 2GM$ and is negative for $r < 2GM$.

Taking the mere numerical values is misleading. Consider, for incoming light, the ratio

$$\frac{(dr/dt)_{\text{particle}}}{(dr/dt)_{\text{light}}} = \frac{1}{1 + \sqrt{\dfrac{r}{2GM}}}.$$

So the particle is always slower than light; however, it approaches the velocity of
light when approaching $r = 0$.
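
To make the comparison concrete, the following Python sketch evaluates the coordinate velocities discussed here. It is our illustration under stated assumptions: the velocity formulas are those for a particle falling from rest at infinity and for radial light rays, $GM = 1$ is an arbitrary choice of units, and all function names are hypothetical.

```python
import math

GM = 1.0  # geometric units; illustrative choice, horizon at r = 2GM = 2


def gp_particle_velocity(r):
    """dr/dt of a particle falling from rest at infinity, Gullstrand-Painleve time."""
    return -math.sqrt(2.0 * GM / r)


def gp_light_velocity(r, outgoing=False):
    """dr/dt of radial light rays in Gullstrand-Painleve coordinates."""
    sign = 1.0 if outgoing else -1.0
    return sign - math.sqrt(2.0 * GM / r)


def schwarzschild_particle_velocity(r):
    """dr/dt of the same infalling particle in Schwarzschild coordinates."""
    return -(1.0 - 2.0 * GM / r) * math.sqrt(2.0 * GM / r)


def particle_to_light_ratio(r):
    """Ratio of particle velocity to incoming light velocity (GP coordinates)."""
    return gp_particle_velocity(r) / gp_light_velocity(r)


if __name__ == "__main__":
    for r in (10.0, 2.0, 0.5, 0.01):
        print(r, gp_particle_velocity(r), gp_light_velocity(r),
              particle_to_light_ratio(r))
```

The ratio stays below 1 for every $r > 0$ and tends to 1 only as $r \to 0$, while the Schwarzschild coordinate velocity goes to zero at $r = 2GM$.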


Table 2. Velocities in different coordinate systems.$^{f}$

Quantity                                 | Schwarzschild                                             | Gullstrand–Painlevé
Particles: coordinate velocity $dr/dt$   | $\pm\left(1 - \frac{2GM}{r}\right)\sqrt{\frac{2GM}{r}}$   | $-\sqrt{\frac{2GM}{r}}$
Particles: proper velocity $dr/d\tau$    | $\pm\sqrt{\frac{2GM}{r}}$                                 | $\pm\sqrt{\frac{2GM}{r}}$
Light rays: coordinate velocity $dr/dt$  | $\pm\left(1 - \frac{2GM}{r}\right)$                       | $\pm 1 - \sqrt{\frac{2GM}{r}}$


$^{f}$ The velocities of outgoing particles are valid only for the boundary condition specified. The
coordinate velocity for outgoing particles in GP coordinates does not fit in our table and is thus
suppressed.

The Gullstrand–Painlevé form of the metric is regular at the surface $r = 2GM$.
This shows that it is not any kind of barrier, but this observation was not made
until much later, see Eisenstadt.°?


2.3. Six classical representations of the Schwarzschild metric 


As we mentioned, a coordinate system should be chosen according to its convenience 
for describing a certain situation. In the following table (Table 3), we collect six 
widely used forms of the Schwarzschild metric. 


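Table 3. The Schwarzschild metric in various coordinates.

Entry 1. Schwarzschild $(t, r, \theta, \phi)$
  Metric: $ds^2 = -\left(1 - \frac{2m}{r}\right) dt^2 + \frac{1}{1 - \frac{2m}{r}}\,dr^2 + r^2 d\Omega^2$
  Coordinate transformation: (none)
  Characteristic properties: Area of spheres $r = \text{const.}$ is the “Euclidean” $4\pi r^2$

Entry 2. Isotropic $(t, \bar{r}, \theta, \phi)$
  Metric: $ds^2 = -\left(\frac{1 - \frac{m}{2\bar{r}}}{1 + \frac{m}{2\bar{r}}}\right)^2 dt^2 + \left(1 + \frac{m}{2\bar{r}}\right)^4 \left(d\bar{r}^2 + \bar{r}^2 d\Omega^2\right)$
  Coordinate transformation: $r = \bar{r}\left(1 + \frac{m}{2\bar{r}}\right)^2$
  Characteristic properties: Constant-curvature time slices

Entry 3. Eddington–Finkelstein $(v, r, \theta, \phi)$
  Metric: $ds^2 = -\left(1 - \frac{2m}{r}\right) dv^2 + 2\,dv\,dr + r^2 d\Omega^2$
  Coordinate transformation: $v = t + r + 2m \ln\left|\frac{r}{2m} - 1\right|$
  Characteristic properties: Ingoing light rays: $dv = 0$

Entry 4. Kerr–Schild $(\tilde{t}, x, y, z)$
  Metric: $ds^2 = (\eta_{\alpha\beta} + 2m\,\ell_\alpha \ell_\beta)\,dx^\alpha dx^\beta$, $\ell_\alpha = \frac{1}{\sqrt{r}}\left(1, \frac{x}{r}, \frac{y}{r}, \frac{z}{r}\right)$
  Coordinate transformation: $\tilde{t} = v - r$, $r^2 = x^2 + y^2 + z^2$
  Characteristic properties: “Cartesian” coordinates

Entry 5. Lemaître $(T, R, \theta, \phi)$
  Metric: $ds^2 = -dT^2 + \frac{2m}{r}\,dR^2 + r^2 d\Omega^2$, $r = \left[\frac{3}{2}(R - T)\right]^{2/3} (2m)^{1/3}$
  Coordinate transformation: $dT = dt + \sqrt{\frac{2m}{r}}\,\frac{dr}{1 - \frac{2m}{r}}$, $dR = dt + \sqrt{\frac{r}{2m}}\,\frac{dr}{1 - \frac{2m}{r}}$
  Characteristic properties: Infalling particles: $dR = 0$

Entry 6. Gullstrand–Painlevé $(\tilde{t}, r, \theta, \phi)$
  Metric: $ds^2 = -\left(1 - \frac{2m}{r}\right) d\tilde{t}^2 + 2\sqrt{\frac{2m}{r}}\,d\tilde{t}\,dr + dr^2 + r^2 d\Omega^2$
  Coordinate transformation: $d\tilde{t} = dt + \sqrt{\frac{2m}{r}}\,\frac{dr}{1 - \frac{2m}{r}}$
  Characteristic properties: Infalling particles: $dr = -\sqrt{\frac{2m}{r}}\,d\tilde{t}$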


2.4. The concept of a Schwarzschild black hole 


We first draw a simple picture of a black hole. The event horizon and the stationary 
limit emerge as characteristic features. These are subsequently defined in a more 
mathematical way. 

In 1783, John Michell communicated his thoughts on the means of discovering
the Distance, magnitude, etc. of the fixed stars, in consequence of the diminution
of the velocity of their light ...1?° to the Royal Society in London. In the context
of Newton’s particle theory of light, he calculated that sufficiently massive stars
exert a gravitational attraction so strong that even light could not
escape. A few years later (1796), Pierre-Simon Laplace published similar ideas.

In modern notation, we may reconstruct the arguments as follows. We throw a 
mass m from the surface of the Earth, assuming that there were no air, in upward 
direction with an initial velocity v. It will always fall back, unless its initial velocity 
reaches a sufficiently high value $v_{\text{escape}}$, providing the mass with such a kinetic energy
that it can overpower the gravitational attraction of the Earth. Energy conservation
then immediately yields the formula

$$v_{\text{escape}} = \sqrt{\frac{2GM_\oplus}{R_\oplus}},$$

where $G$ is Newton’s gravitational constant and $M_\oplus$ and $R_\oplus$ are the mass and the radius
of the spherically conceived Earth, respectively.

For the Earth we find $v_{\text{escape}} \approx 11.2$ km/s. If we now compress the Earth appreciably
(thought experiment!) until the escape velocity coincides with the speed of
light, $v_{\text{escape}} = c$, its compressed “Schwarzschild” radius becomes $r_\oplus = 2GM_\oplus/c^2 \approx 1$ cm.
For the Sun, with its mass $M_\odot$, we have$^{g}$

$$r_\odot = \frac{2GM_\odot}{c^2} \approx 3\ \text{km}.$$

At any smaller radius, the light will be confined to the corresponding body. This is
an intuitive picture of a spherically symmetric invisible “black hole”.$^{h}$
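
The numbers quoted above are easy to reproduce. The following Python sketch is our illustration; the rounded SI values for $G$, $c$, and the masses and radius are assumptions inserted for the example, and the function names are hypothetical.

```python
import math

# Rounded reference values in SI units (assumed here for illustration only).
G = 6.674e-11        # Newton's constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m/s
M_EARTH = 5.972e24   # mass of the Earth, kg
R_EARTH = 6.371e6    # mean radius of the Earth, m
M_SUN = 1.989e30     # mass of the Sun, kg


def escape_velocity(mass, radius):
    """Newtonian escape velocity sqrt(2 G M / R) in m/s."""
    return math.sqrt(2.0 * G * mass / radius)


def schwarzschild_radius(mass):
    """Radius 2 G M / c^2 (in m) at which the escape velocity equals c."""
    return 2.0 * G * mass / C ** 2


if __name__ == "__main__":
    print("Earth escape velocity [km/s]:", escape_velocity(M_EARTH, R_EARTH) / 1e3)
    print("Earth Schwarzschild radius [cm]:", schwarzschild_radius(M_EARTH) * 1e2)
    print("Sun Schwarzschild radius [km]:", schwarzschild_radius(M_SUN) / 1e3)
```

With these inputs, the three printed values fall close to the rounded figures 11.2 km/s, 1 cm, and 3 km given in the text.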

It is very intriguing to see how far-sightedly Michell anticipated the status of
today’s observational black hole physics:


If there should really exist in nature bodies, whose density is not less than 
that of the sun, and whose diameters are more than 500 times the diameter 
of the sun, since their light could not arrive at us; [...] we could have no
information from sight; yet, if any other luminous bodies should happen 
to revolve about them we might still perhaps from the motions of these 
revolving bodies infer the existence of the central ones with some degree of 
probability ... 


$^{g}$ For the sake of clarity, we display here the speed of light $c$ explicitly.

$^{h}$ The phrase “black hole” is commonly associated with Wheeler (1968). It definitely appears earlier
in the literature: In the January 1964 edition of the Science News Letter, the journalist Ann Ewing
entitled her report on the meeting of the American Association for the Advancement of Science
in Cleveland “Black Holes” in space. And if you have a look into an arbitrary English language
dictionary published before ca. 1970, you will learn that “black hole” refers to a notorious dungeon
in Calcutta (now Kolkata) in the 18th century, apparently a place of no return.

This could be a verdict on the current observations of the black hole Sgr A*
(“Sagittarius A-star”) in the center of our Milky Way (and this is not a thought
experiment); for a popular account, see Sanders.!®° Sgr A* has a mass of about
$4 \times 10^6\,M_\odot$. Thus, its Schwarzschild radius is far from being minute: it is about
$3 \times 4 \times 10^6$ km, or about 17 solar radii.

A cautionary remark has to be made, though, see Penrose.!°° In Newtonian gravity,
$c$ has no absolute meaning as it does in special relativity. It is conceivable that the speed of
light in strong Newtonian gravitational fields could be larger than $c$. Consequently,
the Michell-type argument becomes pertinent only if $c$ is the maximal speed for
all phenomena, as in the Minkowski space of special relativity, or, if gravity is
involved, in the Riemannian space of GR.

Let us follow the way of visualizing the black hole concept by means of everyday 
physics a bit further: We explore the Schwarzschild and, later in Sec. 3.4, the Kerr 
spacetime by boat. Schwarzschild spacetime is mimicked by a hole in a lake in which 
the surrounding water plunges simply radially without whirling around (Monticello 
Dam, California). The water flowing toward the hole will drag our boat to the 
center. Our boat may move around quite freely as long as the current is weak. 

However, at some distance from the hole, the current becomes so strong that
our boat, engines working at their maximum power, can merely hold its position.
This is the stationary limit. In the case of our circularly symmetric water hole, the
stationary limit forms a ring. Bad for the boat: The stationary limit is also the
ring of no return. At best, the boat remains at its position; it will never escape.
Any millimeter across the stationary limit will doom the boat: it will inevitably be
sucked into the throat. Accordingly, the stationary limit coincides in this spherically
symmetric case with the so-called event horizon.


2.4.1. Event horizon 


In 1958, Finkelstein®® characterized the surface r = 2m as a “semi-permeable mem- 
brane” in spacetime, that is, a surface which can be crossed only in one direction. 


Fig. 7. Not quite seriously: “Schwarzschild” (left) versus “Kerr” (right). (For color version, see 
page I-CP2.) 

As soon as our boat has passed the event horizon, it can never come back. This
property can be formulated in an invariant way: The light cones at each point of the 
surface have to nestle tangentially to the membrane. In 1964, Penrose!*® termed 
the null cone which divides observable from unobservable regions an event horizon. 
Mathematically speaking, the event horizon is characterized by having tangent vec- 
tors which are light-like or null at all points. Therefore, the event horizon is a null 
hypersurface. This is what is meant by a trapped surface,?® see Figs. 8 and 9, left 
image: a compact, spacelike, two-dimensional submanifold with the property that 
outgoing future-directed light rays converge in both directions everywhere on the 
submanifold. All these characterizations quite intuitively show up in the Penrose— 
Kruskal diagram to be discussed later. 

In view of the preceding paragraph, we define a black hole as a region of space- 
time separated from infinity by an event horizon, see Carroll?® and Brill.?? 

[Figure 8 sketch; labels: “light cone”, “time”, “line of contact”.]

Fig. 8. A null hypersurface is not necessarily an event horizon: Imagine a light cone that touches 
a hypersurface along the line of contact. Thus, the light cone is tangent as well as normal (in a 
spacetime sense) to the surface. Consequently, all such surfaces are null hypersurfaces. In the cases 
A and B, the light cone is entirely trapped inside the surface. Case A suggests that the surface 
does not close in a finite region, therefore the enclosed volume is not compact. Case B represents 
a (part of a) circle, which encloses all tangential light cones, and this forms a (black hole) event
horizon. In case C, the light cone intersects the hypersurface. The white domain is outside the 
null surface but inside the light cone and, thus, reachable from within the enclosed domain. 

[Figure 9 sketch; labels: $t'$, $r = 2m$.]

Fig. 9. In- and outgoing Eddington–Finkelstein coordinates (where we introduce $t'$ with $v = t' + r$,
$u = t' - r$). The arrows indicate the direction of the original Schwarzschild coordinate time (and
thereby the direction of the Killing vector $\partial_t$). The left figure illustrates a black hole: All incoming
photons traverse the event horizon and terminate in the singularity. The right figure illustrates a
white hole: All outgoing photons emerge from the singularity, cross the horizon, and propagate
out to infinity.


Observational evidence in favor of black holes was reviewed by Narayan and 
McClintock.!2° 


2.4.2. Killing horizon 


The stationary limit surface is rendered more precise in the notion of a Killing
horizon. A particle at rest (with respect to the infinity of an asymptotically flat,
stationary spacetime) is required to follow the trajectories of the timelike
Killing vector.$^{i}$ However, if we have a Killing vector $K$ describing a stationary
spacetime, then at some points $K$ may become lightlike, that is, $K^\mu K_\mu = 0$. If
all these points build up a hypersurface, then this null hypersurface is called a
Killing horizon. Evidently, this notion is of a local character, in contrast to the
definition of an event horizon, which refers to events in the future and
is therefore of a nonlocal character, see Fig. 8.

As we will see for the Schwarzschild black hole, see Fig. 9, outside the black hole
the Killing vector is timelike, that is, $K^\mu K_\mu < 0$; on the Killing horizon it becomes
null, $K^\mu K_\mu = 0$ (by definition of the horizon); and inside it becomes spacelike,
$K^\mu K_\mu > 0$.


$^{i}$ Using the definitions of the covariant derivative and of the Christoffel symbols, we can derive the
following equation for an arbitrary vector $K$,

$$K^\alpha \partial_\alpha g_{\mu\nu} = 2\nabla_{(\mu} K_{\nu)} - 2 g_{\alpha(\mu}\partial_{\nu)} K^\alpha. \qquad (39)$$

Assuming $K^\alpha$ and $g_{\mu\nu}$ to be constant in time demands $\nabla_{(\mu} K_{\nu)} = 0$. Hence $K$ has to be a Killing
vector. In this coordinate system, we have $K^\alpha K_\alpha = g_{00}$. Although $K$ acts as time translation, it
is not necessarily timelike!

In the Schwarzschild case it will turn out that the event horizon and Killing 
horizon coincide, in the Kerr case they separate. 


2.4.3. Surface gravity 
From the definition of the Killing horizon it can be shown?? that the quantity

$$\kappa^2 = -\frac{1}{2}\,(\nabla_\mu K_\nu)(\nabla^\mu K^\nu) \qquad (40)$$

is constant on the Killing horizon and positive. The quantity $\kappa$ is called surface
gravity. In simple cases, it has the interpretation of an acceleration or gravitational
force per unit mass on the horizon. In the Schwarzschild spacetime it takes the
value $\kappa = 1/(4m)$, which is the acceleration of a particle with unit mass as seen from
infinity; compare with the Newtonian “field strength” (2) for $r = 2m$:

$$|f| = \frac{m}{r^2} = \frac{m}{(2m)^2} = \frac{1}{4m}. \qquad (41)$$

In general, there is no such simple interpretation.


2.4.4. Infinite redshift 


Another property associated with the surface $K^\mu K_\mu = 0$ is the infinite redshift. In
view of the relation for the general relativistic time delay,

$$\tau_0(x_B) = \sqrt{\frac{g_{tt}(x_B)}{g_{tt}(x_A)}}\;\tau_0(x_A),$$

$g_{tt} \to 0$ can be interpreted as follows. Consider $\tau_0(x_B)$, the time measured by a
clock B resting well away from the Killing horizon, whereas clock A with $\tau_0(x_A)$ is
nearly at the Killing horizon. If $g_{tt}(x_A) \to 0$, we get $\tau_0(x_B) \to \infty$. From the point
of view of clock B, clock A’s last signal, right before A hits the Killing horizon, will
not reach B in a finite time, that is, never. To put it a little differently: Signals
sent by A with constant frequency arrive increasingly delayed at B.
For B the frequency approaches zero. This is called infinite redshift.

Let us work out these ideas for the Schwarzschild solution and let us take “photons”
in spacetime instead of boats on a lake.
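
The divergence of the time delay near the Killing horizon can be illustrated numerically. The following Python sketch is our illustration, assuming the Schwarzschild value $g_{tt} = -(1 - 2m/r)$ from Table 3 and two static clocks at radii $r_A$ and $r_B$; the function names and the default $m = 1$ are hypothetical choices.

```python
import math


def g_tt(r, m=1.0):
    """Schwarzschild g_tt = -(1 - 2m/r) in geometric units, for r > 2m."""
    return -(1.0 - 2.0 * m / r)


def delay_factor(r_a, r_b, m=1.0):
    """Ratio tau_0(x_B) / tau_0(x_A) = sqrt(g_tt(x_B) / g_tt(x_A))."""
    return math.sqrt(g_tt(r_b, m) / g_tt(r_a, m))


if __name__ == "__main__":
    r_b = 100.0  # clock B far away from the horizon
    for r_a in (4.0, 2.1, 2.001, 2.000001):
        print(r_a, delay_factor(r_a, r_b))
```

As clock A is placed closer to $r = 2m$, the printed factor grows without bound, which is the infinite redshift described above.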


2.5. Using light rays as coordinate lines 


Schwarzschild coordinates exhibit a coordinate singularity at $r = 2m$. This obstructs
the discussion of the event horizon considerably. As we have seen, light rays penetrate
the horizon without difficulty. This suggests using light rays as coordinate
lines. Therefore, we introduce in- and outgoing Eddington–Finkelstein coordinates.
By combining both, we arrive at Kruskal–Szekeres coordinates, which provide a regular
coordinate system for the whole Schwarzschild spacetime.


2.5.1. Eddington—Finkelstein coordinates 


In relativity, light rays, the quasi-classical trajectories of photons, are null geodesics. 
In special relativity, this is quite obvious, since in Minkowski space the geodesics are 
straight lines and “null” just means v = c. A more rigorous argument involves the 
solution of the Maxwell equations for the vacuum and the subsequent determination 
of the normals to the wave surface (rays) which turn out to be null geodesics. 
This remains valid in general relativity. Null geodesics can be easily obtained by 
integrating the equation $0 = ds$. We find for the Schwarzschild metric, specializing
to radial light rays with $d\theta = 0 = d\phi$,

$$t = \pm\left(r + 2m \ln\left|\frac{r}{2m} - 1\right|\right) + \text{const.} \qquad (42)$$

If we denote with $r_0$ the solution of the equation $r + 2m \ln\left|\frac{r}{2m} - 1\right| = 0$, we have
for the $t$-coordinate of the light ray $t(r_0) =: v$. Hence, if $r = r_0$, we can use $v$ to
label light rays. In view of this, we introduce $v$ and $u$,

$$v := t + r + 2m \ln\left|\frac{r}{2m} - 1\right|, \qquad (43)$$

$$u := t - r - 2m \ln\left|\frac{r}{2m} - 1\right|. \qquad (44)$$

Then ingoing null geodesics are described by $v = \text{const.}$, outgoing ones by
$u = \text{const.}$, see Fig. 9. We define ingoing Eddington–Finkelstein coordinates by replacing
the “Schwarzschild time” $t$ by $v$. In these coordinates $(v, r, \theta, \phi)$, the metric becomes

$$ds^2 = -\left(1 - \frac{2m}{r}\right) dv^2 + 2\,dv\,dr + r^2 d\Omega^2. \qquad (45)$$

For radial null geodesics, $ds^2 = d\theta = d\phi = 0$, we find two solutions of (45), namely
$v = \text{const.}$ and $v = 4m \ln|r/2m - 1| + 2r + \text{const.}$ The first one describes infalling
photons, i.e. $t$ increases if $r$ approaches 0. At $r = 2m$, there is no singular behavior
any longer for incoming photons. Ingoing Eddington–Finkelstein coordinates
are particularly useful for describing gravitational collapse. Analogously,
for outgoing null geodesics take $(u, r, \theta, \phi)$ as new coordinates. In these outgoing
Eddington–Finkelstein coordinates the metric reads

$$ds^2 = -\left(1 - \frac{2m}{r}\right) du^2 - 2\,du\,dr + r^2 d\Omega^2. \qquad (46)$$

Outgoing light rays are now described by $u = \text{const.}$, ingoing light rays by
$u = -(4m \ln|r/2m - 1| + 2r) + \text{const.}$ In these coordinates, the hypersurface $r = 2m$
(the “horizon”) can be recognized as a null hypersurface (its normal is null or
lightlike) and as a semi-permeable membrane.
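
The two families of radial null curves of (45) can be checked directly. The following Python sketch is our illustration; the function names and the choice $m = 1$ are assumptions. It implements the transformation (43), (44) and evaluates the radial part of (45) for small displacements along both families, on either side of the horizon.

```python
import math

M = 1.0  # mass parameter m in geometric units; illustrative choice


def tortoise(r):
    """The radial function r + 2m ln|r/2m - 1| appearing in Eqs. (42)-(44)."""
    return r + 2.0 * M * math.log(abs(r / (2.0 * M) - 1.0))


def to_eddington_finkelstein(t, r):
    """Return (v, u) from Schwarzschild (t, r) via Eqs. (43) and (44)."""
    return t + tortoise(r), t - tortoise(r)


def ds2_ingoing(r, dv, dr):
    """Radial part of the line element (45) for small displacements (dv, dr)."""
    return -(1.0 - 2.0 * M / r) * dv ** 2 + 2.0 * dv * dr


def second_family_slope(r):
    """dv/dr along v = 4m ln|r/2m - 1| + 2r + const., valid for r != 2m."""
    return 4.0 * M / (r - 2.0 * M) + 2.0


if __name__ == "__main__":
    dr = 1e-3
    for r in (1.0, 3.0, 10.0):
        print(r,
              ds2_ingoing(r, 0.0, dr),                          # v = const.
              ds2_ingoing(r, second_family_slope(r) * dr, dr))  # second family
```

Both evaluations vanish up to rounding. Note that the first family, $v = \text{const.}$, is well defined at $r = 2m$ itself, whereas the slope of the second family diverges there.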


2.5.2. Kruskal-Szekeres coordinates 


Next we try to combine the advantages of in- and outgoing Eddington–Finkelstein
coordinates in the hope of obtaining a fully regular coordinate system of the
Schwarzschild spacetime. Therefore, we assume coordinates $(u, v, \theta, \phi)$. Some (computer)
algebra yields the corresponding representation of the metric:

$$ds^2 = -\left(1 - \frac{2m}{r(u, v)}\right) du\,dv + r^2(u, v)\,d\Omega^2. \qquad (47)$$

Unfortunately, we still have a coordinate singularity at $r = 2m$. We can get rid of
it by reparametrizing the surfaces $u = \text{const.}$ and $v = \text{const.}$ via

$$\tilde{u} = -\exp\left(-\frac{u}{4m}\right), \qquad \tilde{v} = \exp\left(\frac{v}{4m}\right). \qquad (48)$$

In these coordinates, the metric reads [$r = r(\tilde{u}, \tilde{v})$ is implicitly given by (48) and
(44), (43), $r_S = 2m$]

$$ds^2 = -\frac{4 r_S^3}{r}\exp\left(-\frac{r}{r_S}\right) d\tilde{v}\,d\tilde{u} + r^2(\tilde{u}, \tilde{v})\,d\Omega^2. \qquad (49)$$

Again, we go back from $\tilde{u}$ and $\tilde{v}$ to time- and spacelike coordinates,

$$\tilde{t} := \frac{1}{2}(\tilde{v} + \tilde{u}), \qquad \tilde{r} := \frac{1}{2}(\tilde{v} - \tilde{u}). \qquad (50)$$

In terms of the original Schwarzschild coordinates we have

$$\tilde{r} = \sqrt{\frac{r}{2m} - 1}\,\exp\left(\frac{r}{4m}\right)\cosh\frac{t}{4m}, \qquad (51)$$

$$\tilde{t} = \sqrt{\frac{r}{2m} - 1}\,\exp\left(\frac{r}{4m}\right)\sinh\frac{t}{4m}. \qquad (52)$$

The Schwarzschild metric

$$ds^2 = \frac{4 r_S^3}{r}\exp\left(-\frac{r}{r_S}\right)\left(-d\tilde{t}^2 + d\tilde{r}^2\right) + r^2 d\Omega^2 \qquad (53)$$

in these Kruskal–Szekeres coordinates $(\tilde{t}, \tilde{r}, \theta, \phi)$ behaves regularly at the gravitational
radius $r = 2m$. If we substitute (53) into the Einstein equation (via computer
algebra), then we see that it is a solution for all $r > 0$. Equations (51) and (52) yield

$$\tilde{r}^2 - \tilde{t}^2 = \left(\frac{r}{2m} - 1\right)\exp\left(\frac{r}{2m}\right). \qquad (54)$$


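Thus, the transformation is valid only for regions with $|\tilde{r}| > |\tilde{t}|$. However, we can
find a set of transformations which cover the entire $(\tilde{t}, \tilde{r})$-space. They are valid in
different domains, indicated here by I, II, III, and IV, to be explained below:

$$\text{(I)} \quad \tilde{t} = \sqrt{\frac{r}{2m} - 1}\,\exp\left(\frac{r}{4m}\right)\sinh\frac{t}{4m}, \quad \tilde{r} = \sqrt{\frac{r}{2m} - 1}\,\exp\left(\frac{r}{4m}\right)\cosh\frac{t}{4m}, \qquad (55)$$

$$\text{(II)} \quad \tilde{t} = \sqrt{1 - \frac{r}{2m}}\,\exp\left(\frac{r}{4m}\right)\cosh\frac{t}{4m}, \quad \tilde{r} = \sqrt{1 - \frac{r}{2m}}\,\exp\left(\frac{r}{4m}\right)\sinh\frac{t}{4m}, \qquad (56)$$

$$\text{(III)} \quad \tilde{t} = -\sqrt{\frac{r}{2m} - 1}\,\exp\left(\frac{r}{4m}\right)\sinh\frac{t}{4m}, \quad \tilde{r} = -\sqrt{\frac{r}{2m} - 1}\,\exp\left(\frac{r}{4m}\right)\cosh\frac{t}{4m}, \qquad (57)$$

$$\text{(IV)} \quad \tilde{t} = -\sqrt{1 - \frac{r}{2m}}\,\exp\left(\frac{r}{4m}\right)\cosh\frac{t}{4m}, \quad \tilde{r} = -\sqrt{1 - \frac{r}{2m}}\,\exp\left(\frac{r}{4m}\right)\sinh\frac{t}{4m}. \qquad (58)$$

The inverse transformation is given by

$$\left(\frac{r}{2m} - 1\right)\exp\left(\frac{r}{2m}\right) = \tilde{r}^2 - \tilde{t}^2, \qquad (59)$$

$$\frac{t}{4m} = \begin{cases} \operatorname{Artanh}\dfrac{\tilde{t}}{\tilde{r}}, & \text{for (I) and (III)}, \\[1ex] \operatorname{Artanh}\dfrac{\tilde{r}}{\tilde{t}}, & \text{for (II) and (IV)}. \end{cases} \qquad (60)$$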

The Kruskal–Szekeres coordinates $(\tilde{t}, \tilde{r}, \theta, \phi)$ cover the entire spacetime, see Fig. 10.
By means of the transformation equations we recognize that we need two
Schwarzschild coordinate systems in order to cover the same domain. Regions (I)
and (III) each correspond to an asymptotically flat universe with $r > 2m$.
Regions (II) and (IV) represent two regions with $r < 2m$. Since $\tilde{t}$ is a time coordinate,
we see that the regions are time reversed with respect to each other. Within
these regions, real physical singularities (corresponding to $r = 0$) occur along the
curves $\tilde{t}^2 - \tilde{r}^2 = 1$. From the form of the metric we can infer that radial light-like
geodesics (and therewith the light cones $ds = 0$) are lines with slope $\pm 1$. This makes
the discussion of the causal structure particularly simple.

Fig. 10. Kruskal–Szekeres diagram of the Schwarzschild spacetime.


2.6. Penrose—Kruskal diagram 


We represent the Schwarzschild spacetime in a manner analogous to the Penrose
diagram of the Minkowski spacetime. To this end, we proceed along the same line
as in the Minkowskian case.

First, we switch again to null coordinates $\tilde{v} = \tilde{t} + \tilde{r}$ and $\tilde{u} = \tilde{t} - \tilde{r}$ and perform
a conformal transformation which maps infinity into the finite (again, by means of
the tangent function). Finally, we return to a time-like coordinate $\hat{t}$ and a space-like
coordinate $\hat{r}$. We perform these transformations all in one according to

$$\tilde{t} + \tilde{r} = \tan\frac{\hat{t} + \hat{r}}{2}, \qquad (61)$$

$$\tilde{t} - \tilde{r} = \tan\frac{\hat{t} - \hat{r}}{2}. \qquad (62)$$

The Schwarzschild metric then reads

$$ds^2 = \frac{r_S^3}{r(\hat{r}, \hat{t})}\,\exp\left(-\frac{r(\hat{r}, \hat{t})}{r_S}\right)\frac{-d\hat{t}^2 + d\hat{r}^2}{\cos^2\frac{\hat{t} + \hat{r}}{2}\,\cos^2\frac{\hat{t} - \hat{r}}{2}} + r^2(\hat{r}, \hat{t})\,d\Omega^2, \qquad (63)$$

where the function $r(\hat{t}, \hat{r})$ is implicitly given by

$$\left(\frac{r}{2m} - 1\right)\exp\left(\frac{r}{2m}\right) = \tan\frac{\hat{r} + \hat{t}}{2}\,\tan\frac{\hat{r} - \hat{t}}{2}. \qquad (64)$$

The corresponding Penrose–Kruskal diagram is displayed in Fig. 11. The notations
for the different infinities can be extracted from Table 1. In contrast to Minkowski
space, light rays and particles need not escape to infinity but may instead enter the
black hole (II). Likewise, light rays and particles need not emerge from infinity but
may emerge from the white hole (IV).

Fig. 11. Penrose–Kruskal diagram of the Schwarzschild spacetime. Region II corresponds to a
black hole, region IV to a white hole. Regions I and III correspond to two universes.


2.7. Adding electric charge and the cosmological constant:
Reissner–Nordström


As mentioned in the historical remarks, soon after Schwarzschild’s solution, the
first generalizations, including electric charge and the cosmological constant, were
published. We can be even quicker. We already calculated the Ricci tensor for the
Gullstrand–Painlevé ansatz. If we use the well-known energy-momentum tensor for
a point charge,*! the field equation may be written as$^{j}$

$$R_\mu{}^\nu - \frac{1}{2}\,R\,\delta_\mu^\nu + \Lambda\,\delta_\mu^\nu = \kappa\lambda_0\,\frac{q^2}{2r^4}\,\mathrm{diag}(-1, -1, 1, 1). \qquad (65)$$

Taking the trace, we find $R = 4\Lambda$ and arrive at

$$R_2{}^2 = R_3{}^3 = \frac{2\,\partial_r(r\psi)}{r^2} = \Lambda + \frac{q^2}{r^4}. \qquad (66)$$


$^{j}$ Einstein’s gravitational constant is denoted by $\kappa$; $\lambda_0 = \sqrt{\varepsilon_0/\mu_0}$ is the admittance of the vacuum.
With $c = 1$ and $G = 1$ we have $\kappa\lambda_0 = 2$.

This equation can be integrated elementarily:

$$2\psi = -\frac{q^2}{r^2} + \frac{2\alpha}{r} + \frac{\Lambda}{3}\,r^2. \qquad (67)$$

This function also solves the remaining two field equations. The integration constant
$\alpha$ is again the mass $m$. Substituted into (37) and transformed to Schwarzschild
coordinates ($f = 1 - 2\psi$), the solution reads

$$ds^2 = -f(r)\,dt^2 + \frac{dr^2}{f(r)} + r^2 d\Omega^2, \qquad (68)$$

with

$$f(r) = 1 - \frac{2m}{r} + \frac{q^2}{r^2} - \frac{\Lambda}{3}\,r^2. \qquad (69)$$


A detailed derivation using Schwarzschild coordinates and computer algebra can be 
found in Puntigam et al.1°8 

A discussion of the Reissner–Nordström(–de Sitter) solution can be found
in Griffiths and Podolský,®? for example. We only remark that we recover the
Schwarzschild solution for $q = 0$ and $\Lambda = 0$. The algebraic structure of the solution
is identical to the Schwarzschild case. Thus, we find, in general, a singularity at
$r = 0$. However, a pure cosmological solution, $m = 0$, $q = 0$, and $\Lambda \neq 0$, possesses no
singularity and no black hole horizon (for $\Lambda > 0$, $f(r)$ still vanishes at the
cosmological horizon $r = \sqrt{3/\Lambda}$). On the other hand, an electrically charged black hole,
$\Lambda = 0$, exhibits two horizons for $m^2 > q^2$,

$$f(r) = 0 \iff r_\pm = m \pm \sqrt{m^2 - q^2}. \qquad (70)$$


In this respect, the charged black hole shows some similarities to a rotating (Kerr) 
black hole. We will pick up this discussion in Sec. 3.4. 
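
The horizon formula (70) can be checked against the metric function (69). The following Python sketch is our illustration; the function names and the sample values $m = 1$, $q = 0.6$ are assumptions chosen for the example.

```python
import math


def f_rn(r, m, q, lam=0.0):
    """Metric function of Eq. (69): 1 - 2m/r + q^2/r^2 - (Lambda/3) r^2."""
    return 1.0 - 2.0 * m / r + q ** 2 / r ** 2 - lam * r ** 2 / 3.0


def horizons(m, q):
    """Roots (r_minus, r_plus) of f(r) = 0 for Lambda = 0, Eq. (70).

    Returns None if m^2 < q^2, in which case f has no real root.
    """
    disc = m ** 2 - q ** 2
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    return m - root, m + root


if __name__ == "__main__":
    r_minus, r_plus = horizons(1.0, 0.6)
    print(r_minus, r_plus, f_rn(r_minus, 1.0, 0.6), f_rn(r_plus, 1.0, 0.6))
    print(horizons(1.0, 0.0))   # uncharged case: outer horizon at r = 2m
    print(horizons(1.0, 1.5))   # m^2 < q^2: no horizon
```

For $q = 0$ the outer root reduces to the Schwarzschild value $r = 2m$, and for $m^2 < q^2$ no horizon exists.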


2.8. The interior Schwarzschild solution and the TOV equation 


In the last section, we investigated the gravitational field outside a spherically symmetric mass distribution. Now it is time to have a look inside matter, see Adler et al.' Of course, in a first attempt, we have to make decisive simplifications on the internal structure of a star. We will consider cold catalyzed stellar material during the later phase of its evolution, which can be reasonably approximated by a perfect fluid. The typical mass densities are in the range of ≈ 10⁷ g/cm³ (white dwarfs) or ≈ 10¹⁴ g/cm³ (neutron stars, e.g. pulsars). In this context, we assume vanishing angular momentum.
We start again from a static and spherically symmetric metric
$$ds^2 = -e^{A(r)}\,c^2 dt^2 + e^{B(r)}\,dr^2 + r^2 d\Omega^2 \tag{71}$$
and the energy-momentum tensor
$$T_{\mu\nu} = \left(\rho + \frac{p}{c^2}\right) u_\mu u_\nu + p\,g_{\mu\nu}, \tag{72}$$
where ρ = ρ(r) is the spherically symmetric mass density and p = p(r) the pressure (isotropic stress). This has to be supplemented by the equation of state which, for a simple fluid, has the form p = p(ρ).

We compute the nonvanishing components of the field equation by means of computer algebra as (here κ = 8πG/c⁴ is Einstein's gravitational constant and ( )′ = d/dr)
$$-e^{B}\kappa r^2 c^2 \rho + e^{B} + B' r - 1 = 0, \tag{73}$$
$$-e^{B}\kappa p r^2 - e^{B} + A' r + 1 = 0, \tag{74}$$
$$-4e^{B}\kappa p r + 2A'' r + (A')^2 r - A'B' r + 2A' - 2B' = 0. \tag{75}$$
The (φ, φ)-component turns out to be equivalent to the (θ, θ)-component. For convenience, we define a mass function m(r) according to
$$e^{-B} =: 1 - \frac{2m(r)}{r}. \tag{76}$$

We can differentiate (76) with respect to r and find, after substituting (73), a differential equation for m(r) which can be integrated, provided ρ(r) is assumed to be known:
$$m(r) = \int_0^r \frac{\kappa}{2}\,\rho(\xi)\,c^2\,\xi^2\,d\xi. \tag{77}$$
Differentiating (74) and using all three components of the field equation, we obtain a differential equation for A:
$$A' = -\frac{2p'}{p + \rho c^2}. \tag{78}$$

We can derive an alternative representation of A′ by substituting (76) into (74). Then, together with (78), we arrive at the Tolman–Oppenheimer–Volkoff (TOV) equation
$$p' = -\frac{(\boldsymbol{\rho c^2} + p)\left(\boldsymbol{m} + \frac{\kappa p r^3}{2}\right)}{\boldsymbol{r}\,(\boldsymbol{r} - 2m)}. \tag{79}$$
Terms that survive in the Newtonian limit are emphasized by boldface letters. The system of equations consisting of (77), (78), the TOV equation (79), and the equation of state p = p(ρ) forms a complete set of equations for the unknown functions A(r), ρ(r), p(r), and m(r), with
$$ds^2 = -e^{A(r)}\,c^2 dt^2 + \frac{dr^2}{1 - \frac{2m(r)}{r}} + r^2 d\Omega^2. \tag{80}$$
These differential equations have to be supplemented by initial conditions.

In the center of the star, there is, of course, no enclosed mass. Hence, we demand m(0) = 0. The density has to be finite at the origin, i.e. ρ(0) = ρ_c, where ρ_c is the density of the central region. At the surface of the star, at r = R_⊙, we have to match matter with vacuum. In vacuum, there is no pressure, which requires p(R_⊙) = 0. Moreover, the mass function should then yield the total mass of the star, m(R_⊙) := GM_⊙/c². Finally, we have to match the components of the metric. Therefore, we have to demand exp[A(R_⊙)] = 1 − 2m(R_⊙)/R_⊙.

Equations (73), (74), (75) and certain regularity conditions which generalize our boundary conditions, that is,

• regularity of the geometry at the origin,
• finiteness of central pressure and density,
• positivity of central pressure and density,
• positivity of pressure and density,
• monotonic decrease of pressure and density,

impose conditions on the functions ρ and p. Then, even without the explicit knowledge of the equation of state, the general form of the metric can be determined. For recent work, see Rahman and Visser!” and the literature given there.

We can obtain a simple solution if we assume a constant mass density
$$\rho = \rho(r) = \mathrm{const.} \tag{81}$$
One should mention here that ρ is not the physically observable fluid density, which results from an appropriate projection of the energy-momentum tensor into the reference frame of an observer. Thus, this model is not as unphysical as it may look at first. However, there are serious but more subtle objections which we will not discuss further in this context.

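When ρ = const., we can explicitly write down the mass function (77):
$$m(r) = \frac{r^3}{2R^2} \quad\text{with}\quad R := \sqrt{\frac{3}{\kappa\rho c^2}}, \qquad m_\odot := \frac{R_\odot^3}{2R^2}. \tag{82}$$
This allows us to determine one metric function immediately:
$$e^{-B} = 1 - \frac{2m(r)}{r} = 1 - \frac{r^2}{R^2}. \tag{83}$$
With (82), the TOV equation (79) reduces to
$$p' = -\frac{r}{2}\,\frac{(\rho c^2 + p)\,(1 + \kappa R^2 p)}{R^2 - r^2}. \tag{84}$$
It can be solved elementarily by separation of variables:
$$p(r) = \rho c^2\,\frac{\sqrt{R^2 - R_\odot^2} - \sqrt{R^2 - r^2}}{\sqrt{R^2 - r^2} - 3\sqrt{R^2 - R_\odot^2}}. \tag{85}$$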
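The closed-form pressure (85) can be cross-checked by integrating (77) and the TOV equation (79) numerically for constant density. What follows is only a sketch under stated assumptions: geometric units G = c = 1 (so that κ = 8π), a classical fourth-order Runge–Kutta step, and illustrative function names (`tov_rhs`, `pressure_exact`, `integrate_star`) that do not appear in the text.

```python
import math

KAPPA = 8.0 * math.pi  # Einstein's constant for G = c = 1 (assumed units)


def tov_rhs(r, m, p, rho):
    """dm/dr from (77) and dp/dr from the TOV equation (79), with c = 1."""
    dm = 0.5 * KAPPA * rho * r ** 2
    dp = -(rho + p) * (m + 0.5 * KAPPA * p * r ** 3) / (r * (r - 2.0 * m))
    return dm, dp


def pressure_exact(r, rho, r_star):
    """Closed-form pressure (85) for constant density rho and stellar radius r_star."""
    big_r2 = 3.0 / (KAPPA * rho)          # R^2 of (82)
    a = math.sqrt(big_r2 - r_star ** 2)
    b = math.sqrt(big_r2 - r ** 2)
    return rho * (a - b) / (b - 3.0 * a)


def integrate_star(rho, r_star, steps=2000):
    """Integrate m(r) and p(r) from just outside the center to r_star (RK4)."""
    r = 1.0e-6 * r_star                    # avoid the coordinate singularity at r = 0
    m = KAPPA * rho * r ** 3 / 6.0         # m(r) for constant density, see (82)
    p = pressure_exact(0.0, rho, r_star)   # central pressure taken from (85)
    h = (r_star - r) / steps
    for _ in range(steps):
        k1m, k1p = tov_rhs(r, m, p, rho)
        k2m, k2p = tov_rhs(r + h / 2, m + h / 2 * k1m, p + h / 2 * k1p, rho)
        k3m, k3p = tov_rhs(r + h / 2, m + h / 2 * k2m, p + h / 2 * k2p, rho)
        k4m, k4p = tov_rhs(r + h, m + h * k3m, p + h * k3p, rho)
        m += h / 6 * (k1m + 2 * k2m + 2 * k3m + k4m)
        p += h / 6 * (k1p + 2 * k2p + 2 * k3p + k4p)
        r += h
    return r, m, p
```

Starting from the central pressure given by (85), the integrated pressure should drop to zero at the stellar surface, and the integrated mass should agree with m_⊙ of (82). The density and radius passed to `integrate_star` are free test values, not data from the text.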
Using (78) as A′ = −2[ln(p + ρc²)]′ and continuous matching to the exterior eventually yields the interior and exterior Schwarzschild solution for a spherically symmetric body!”
$$ds^2 = \begin{cases} -\left(\dfrac{3}{2}\sqrt{1 - \dfrac{R_\odot^2}{R^2}} - \dfrac{1}{2}\sqrt{1 - \dfrac{r^2}{R^2}}\right)^{2} c^2 dt^2 + \dfrac{1}{1 - \frac{r^2}{R^2}}\,dr^2 + r^2 d\Omega^2, & r \le R_\odot, \\[2ex] -\left(1 - \dfrac{2m_\odot}{r}\right) c^2 dt^2 + \dfrac{1}{1 - \frac{2m_\odot}{r}}\,dr^2 + r^2 d\Omega^2, & r > R_\odot. \end{cases} \tag{86}$$

The solution is only defined for R_⊙ < R. For the Sun^k we have M_⊙ ≈ 2 × 10³⁰ kg, R_⊙ ≈ 7 × 10⁸ m and accordingly ρ_⊙ ≈ 1.4 × 10³ kg/m³. This leads to R ≈ 3 × 10¹¹ m, that is, the radius of the star R_⊙ is much smaller than R: R_⊙ ≪ R. Hence, the square roots in (86) remain real.
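The solar numbers quoted above follow directly from the definition of R in (82). The short sketch below reproduces them in SI units; the rounded constants and the helper names `mean_density` and `curvature_radius` are assumptions made for illustration.

```python
import math

G = 6.674e-11    # gravitational constant in m^3 kg^-1 s^-2 (rounded)
C = 2.998e8      # speed of light in m/s (rounded)
M_SUN = 2.0e30   # kg, rounded as in the text
R_SUN = 7.0e8    # m, rounded as in the text


def mean_density(mass, radius):
    """Mean mass density of a homogeneous ball."""
    return mass / (4.0 / 3.0 * math.pi * radius ** 3)


def curvature_radius(rho):
    """Length R = sqrt(3 / (kappa rho c^2)) of (82), with kappa = 8 pi G / c^4."""
    kappa = 8.0 * math.pi * G / C ** 4
    return math.sqrt(3.0 / (kappa * rho * C ** 2))
```

Calling `curvature_radius(mean_density(M_SUN, R_SUN))` gives a length of a few times 10¹¹ m, several hundred solar radii, which is why the interior solution is comfortably defined for the Sun.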

The condition R_⊙ < R suggests that a sufficiently massive object cannot be stable since no static gravitational field seems possible. This conjecture can be further supported. Even before R_⊙ reaches R, the central pressure becomes infinite:
$$p(0) \to \infty \quad\text{for}\quad R_\odot \to \sqrt{\tfrac{8}{9}}\,R \quad\text{or}\quad m_\odot \to \tfrac{4}{9}\,R_\odot. \tag{87}$$
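The limit (87) can be read off from the closed-form pressure (85). As a brief sketch of the missing step, using only (82) and (85):

```latex
\begin{align*}
p(0) &= \rho c^2\,\frac{R - \sqrt{R^2 - R_\odot^2}}{3\sqrt{R^2 - R_\odot^2} - R},\\
3\sqrt{R^2 - R_\odot^2} = R
  \;&\Longleftrightarrow\; 9\,(R^2 - R_\odot^2) = R^2
  \;\Longleftrightarrow\; R_\odot^2 = \tfrac{8}{9}\,R^2,\\
m_\odot &= \frac{R_\odot^3}{2R^2} = \frac{R_\odot}{2}\cdot\frac{8}{9} = \tfrac{4}{9}\,R_\odot .
\end{align*}
```

The denominator of p(0) thus vanishes while the numerator stays positive, so the central pressure diverges before R_⊙ reaches R.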


If there is no static solution and the situation remains spherically symmetric, we are forced to the conclusion that such a mass distribution must radially collapse, either in an infinite time or to a single point in space. With reasonable simplifications, it was first shown by Oppenheimer and Snyder!#° that the second alternative is true: a very massive object collapses to a black hole. As various singularity theorems show today, this behavior is indeed generic, see Chrusciel et al.°° and Sec. 3.10.


^k To ascertain the consistency of dimensions and units, we recollect the basic definitions:
$$[G] = \frac{(\mathrm{m/s})^4}{\mathrm{N}} = \frac{\mathrm{m}^3}{\mathrm{kg\,s^2}}, \qquad \kappa = \frac{8\pi G}{c^4}.$$
The mass M carries the unit kg, the mass parameter has the dimension of a length:
$$m = \frac{GM}{c^2}, \qquad [m] = \frac{\mathrm{m^3\,kg\,s^2}}{\mathrm{kg\,s^2\,m^2}} = \mathrm{m}.$$
The definition of m(r) in Eq. (77) is consistent. For ρ = const., we have
$$m(r) = \frac{\kappa}{2}\,\rho c^2\,\frac{r^3}{3} = \frac{G}{c^2}\,\frac{4\pi}{3}\,r^3\rho = \frac{G\,M(r)}{c^2}.$$
Here ρ denotes the physical mass density, [ρ] = kg/m³. Thus
$$M(r) := \frac{4\pi}{3}\,r^3\rho$$
is the physical mass with the unit kg.


3. The Kerr Metric (1963)


After some historical reminiscences (Sec. 3.1), we point out how one can arrive at 
the Kerr metric (Sec. 3.2). For that purpose, we derive, in cylindrical coordinates, 
the four corresponding partial differential equations and explain how this proce- 
dure leads to the Kerr metric. In Sec. 3.3, we display the Kerr metric in three 
classical coordinate systems. Thereafter, we develop the concept of the Kerr black 
hole (Sec. 3.4). In Secs. 3.5-3.7, we depict and discuss the geometrical/kinematical 
properties of the Kerr metric. Subsequently, in Sec. 3.8, we turn to the multipole 
moments of the mass and the angular momentum of the Kerr metric, stressing 
analogies to electromagnetism. In Sec. 3.9, we present the Kerr—Newman solu- 
tion with electric charge. Eventually, in Sec. 3.10, we wonder in which sense the 
Kerr black hole is distinguished from the other stationary axially symmetric vacuum 
spacetimes, and, in Sec. 3.11, we mention the rotating disk metric of Neugebauer— 
Meinel as a relevant interior solution with matter. 


... When I turned to Alfred Schild, who was still sitting in the armchair 
smoking away, and said “Its rotating!” he was even more excited than I 
was. I do not remember how we celebrated, but celebrate we did! 


Roy P. Kerr (2009) 


3.1. Historical remarks 


The search for axially symmetric solutions of the Einstein equation started in 1917 
with static and was extended in 1924 to stationary metrics. It culminated in 1963 
with the discovery of the Kerr metric. 

The Schwarzschild solution, as we have seen, describes the gravitational field 
of a spherically symmetric body. Obviously, most planets, moons, and stars rotate 
so that spherical symmetry is lost and one spatial direction is distinguished by 
the three-dimensional angular momentum vector J of the body. Hence, the next 
problem to attack was to search for the gravitational field of a massive rotating 
body. 

When one considers a static and axially symmetric situation (this is the case if the body does not carry angular momentum), then one can choose the rotation axis as the z-axis of a cylindrical polar coordinate system: x¹ = z, x² = ρ and x³ = φ. Then static axial symmetry means that the components of the metric g_μν = g_μν(z, ρ) do not depend on the time t and the azimuthal angle φ (we have here one timelike and one spacelike Killing vector^l).

Already in 1917, Weyl!®® started to investigate static axially symmetric vacuum solutions of Einstein's field equation. He took cylindrical coordinates and proposed the following “canonical” form of the static axisymmetric vacuum line element^m:
$$ds^2 = f\,dt^2 - \left\{ h\,(dz^2 + d\rho^2) + \frac{\rho^2 d\phi^2}{f} \right\}. \tag{89}$$
Here f = f(z, ρ) and h = h(z, ρ) and (x⁰ = t, x¹ = z, x² = ρ, x³ = φ). Weyl was led, in analogy to Newton's theory, to a Poisson equation and found thereby a family of static cylindrically symmetric solutions that could be understood as the exterior field of a line distribution of mass along the rotation axis. Similar investigations were undertaken by Levi-Civita!®? (1917/19).

In the year 1918, Lense and Thirring!®” investigated a rotating body. They specified the energy-momentum tensor of a slowly rotating ball of matter of homogeneous density and integrated the Einstein equation in lowest approximation. They found, for a ball rotating around the z-axis of a spatial Cartesian coordinate system, the linearized Schwarzschild solution in isotropic coordinates, see Table 2, together with two new “gravitomagnetic” correction terms in off-diagonal components of the metric (κ is Einstein's gravitational constant):
$$ds^2 = \underbrace{\left(1 - \frac{2M}{r}\right) dt^2 - \left(1 + \frac{2M}{r}\right)\left(dx^2 + dy^2 + dz^2\right)}_{\text{linearized Schwarzschild}} \;\underbrace{-\,\frac{4\kappa J_z}{r^3}\,(x\,dy - y\,dx)\,dt}_{\text{gravitomagnetic term}}. \tag{90}$$
This is valid for κM ≪ r and κJ_z ≪ r². This gravitomagnetic effect (“the Lense–Thirring effect”) is typical for GR: in Newton's theory a rotating rigid ball has the same gravitational field as a nonrotating one. Gravitomagnetism is alien to Newton's gravitational theory.

In the meantime, the Lense–Thirring effect has been experimentally confirmed by the Gravity Probe B experiment, see Everitt et al.°° They took a gyroscope in a satellite falling freely around the (rotating) Earth. The spin axis of the gyroscope pointed to a fixed guide star. Because of the gravitomagnetic term in (90), the gyroscope executed a (very small) Lense–Thirring precession.^n This can be understood as an interaction of the spin of the gyroscope with the spin of the Earth (spin-spin interaction). Since the gyroscope moves along a 4D geodesic of a spacetime curved by the mass of the Earth, an additional geodetic precession occurs that has to be experimentally separated from the Lense–Thirring term. The geodetic precession had already been derived earlier by de Sitter*? in 1916.^o

^l Remark on Killing vectors: Consider a point P of spacetime with coordinates x^μ. We specify a direction ξ^μ at P. If we have a flat Minkowski space, the components g_μν of the metric, given in Cartesian coordinates, would not change under a motion in the ξ-direction. However, in a curved spacetime, the g_μν will change in general. If ξ^μ fulfills the Killing equations (see Stephani!”)
$$\nabla_\mu \xi_\nu + \nabla_\nu \xi_\mu = 0, \tag{88}$$
with ∇ as covariant derivative operator, then ξ^μ is called a Killing vector, and this vector specifies a direction under which the metric does not change. The Schwarzschild metric is static, that is, it has one timelike Killing vector along the time coordinate. Furthermore, it is spherically symmetric and thus has three additional spacelike Killing vectors. In the Weyl case, because of the axial symmetry around the z-axis, two of those spacelike Killing vectors get lost. Left over in the Weyl case are the two Killing vectors, one timelike, ξ_(t) = ∂_t, and one spacelike, ξ_(φ) = ∂_φ.

^m Weyl used ρ → r, φ → ϑ.

^n For related experiments, see Ciufolini et al.34,35 and Iorio et al.89,92 A recent comprehensive review was given by Will.18° A textbook presentation may be found in Ohanian and Ruffini.!4%

In spherical polar coordinates we have x dy − y dx = r² sin²θ dφ. Thus, up to the sign convention for J_z, the gravitomagnetic cross-term in (90) may be rewritten as (4κJ_z sin²θ/r) dt dφ. A comparison with (89) shows that the canonical Weyl form of the static metric is too narrow for describing rotating bodies.

From 1919 on, there appeared further articles on axisymmetric solutions. Levi-Civita^p (1919) reacted to Weyl's article, and Bach® (1922) pushed the Lense–Thirring line element to the second order in the approximation.

Then, in 1924, Lanczos,!°° extending the Weyl ansatz, started to investigate stationary^q solutions. He found an exact solution for uniformly rotating dust. However, his work was apparently partially overlooked. Later, Akeley? (1931), Andress® (1930) and, in a more definite form, Lewis! (1932) generalized the static Weyl metric to a stationary one by taking into account the gravitomagnetic term of Lense–Thirring. Lewis (1932) wrote, in cylindrical polar coordinates (x₁ → ρ, x₂ → z),
$$ds^2 = f\,dt^2 - \left(e^{\mu}dx_1^2 + e^{\nu}dx_2^2 + l\,d\phi^2\right) - 2m\,dt\,d\phi. \tag{91}$$


He found some exact solutions, typically for rotating cylinders, but not for rotating 
balls. It became definitely clear that, in the axially symmetric case, we may have 
many different exact vacuum solutions, in contrast to the case of spherical symmetry 
with, according to the Birkhoff theorem, the Schwarzschild solution as being unique. 

Not much later, van Stockum!** (1937) determined the gravitational field of an infinite rotating cylinder of dust particles, thereby recovering the Lanczos solution, inter alia. He fitted one of the interior matter solutions of Lewis to an exterior vacuum solution. Continuing on this line of research, Papapetrou'® (1953) started from the Andress–Lewis line element, putting it in a slightly different form, suitable for all stationary axisymmetric vacuum solutions:
$$ds^2 = -e^{\mu}\left(d\rho^2 + dz^2\right) - l\,d\phi^2 - 2m\,d\phi\,dt + f\,dt^2. \tag{92}$$
The functions μ, l, m and f depend only on ρ and z. Papapetrou integrated the field equations and found exact stationary rotating vacuum solutions. However, his solution carried either mass and no angular momentum or angular momentum and no mass. Thus,!48 “this solution is very special and physically of little interest.”
A year later, a new result was published, which gave the problem of finding solutions for a rotating ball a new direction. Petrov!®? (1954), from Kazan, classified algebraically the Einstein vacuum field, that is, the Weyl curvature tensor, according to its eigenvalues and eigenvectors. This information reached the West, in the time of the Cold War, with some delay. A bit later, Pirani'®* (1957) developed a related formalism. It was the Petrov classification and the picking of a suitable class for the gravitational field of an isolated body (Petrov class D, with two double principal null directions) that finally led to the discovery of the Kerr solution during 1963, ten years after the unphysical solutions of Papapetrou.

^o De Sitter had applied it to the Earth–Moon system conceived as a gyroscope precessing around the Sun (the rotation of which can be neglected). This effect can nowadays be measured by Lunar Laser Ranging, see Will.!89

^p See Ref. 108, note 8 with the subtitle “Soluzioni binarie di Weyl”.

^q Stationary spacetimes are those that admit a time-like Killing vector. Static spacetimes are stationary spacetimes for which this Killing vector is hypersurface orthogonal. Physically this implies time reversal invariance and thus the absence of rotation.

Accordingly, it turned out to be a formidable task to find an exact solution for 
a rotating ball and it was only found nearly half a century after the publication 
of Einstein’s field equation, namely in 1963 by Roy Kerr,9* a New Zealander, who 
worked at the time in Texas within the research group of Alfred Schild. It is a two- 
parameter solution of Einstein’s vacuum field equation with mass M and rotation 
(or angular momentum) parameter a := J/M. 

The story of the discovery of the Kerr solution was told by Kerr himself at a 
conference on the occasion of his 70th birthday.®? A decisive starting point of Kerr’s 
investigations was, as mentioned, the Petrov classification. Melia, in his popular 
book!** “Cracking the Einstein Code”, which does not contain any mathematical 
formula — apart from those appearing in two copies of Kerr’s notes and on a 
blackboard in another figure — has told this fascinating battle for solving Einstein’s 
equation, see also the Kerr story in Ferreira.°® 

Dautcourt*® discussed the work of people who were involved in this search for 
axially symmetric solutions but who were not so fortunate as Kerr. In particular, 
Dautcourt himself got this problem handed over from Papapetrou in 1959 as a 
subject for investigation. He used the results of Papapetrou (1953). Dautcourt’s 
scholarly article is an interesting complement to Melia’s book. In particular, it 
becomes clear that the (Lanczos-Akeley—Andress—Lewis-)Papapetrou line element 
(92) was the correct ansatz for the stationary axially symmetric metric and the 
Kerr metric is a special case therefrom. The Papapetrou approach with the line 
element (92) was later, after Kerr’s discovery, brought to fruition by Ernst®* and 
by Kramer and Neugebauer.!° 


3.2. Approaching the Kerr metric 


We derive a second order partial differential equation, the Ernst equation, that gov- 
erns the stationary axially symmetric metrics in Einstein’s theory. Subsequently, 
we sketch how the Kerr solution emerges as a simple case therefrom. 


3.2.1. Papapetrou line element and vacuum field equation 


In more modern literature, the Papapetrou line element (92), which describes some rotation around the axis with ρ = 0, is usually parametrized as follows^r:
$$ds^2 = f\,(dt - \omega\,d\phi)^2 - f^{-1}\left[e^{2\gamma}\left(d\rho^2 + dz^2\right) + \rho^2 d\phi^2\right], \qquad t \in (-\infty,\infty),\ \rho \in [0,\infty),\ z \in (-\infty,\infty),\ \phi \in [0,2\pi); \tag{93}$$
we assume f > 0. We compute the vacuum field equation of this metric. Nowadays we can do this straightforwardly with the assistance of a computer algebra system. During the 1960s, when this work was mainly done, there were no computer algebra systems around. Hearn” released the computer algebra system REDUCE in 1968. Back then, one had to be in command of huge computer resources in order to bring the underlying computer language LISP to work. Today, Reduce can run on every laptop; for other computer algebra systems, see Grabmeier et al.° and Wolfram.!?!

^r See Ernst,°? Buchdahl,?* de Felice and Clarke,®” Quevedo,!® O’Neill,!44 Stephani et al.,}8° Eq. (19.21), Griffiths and Podolsky,®9 and Sternberg.!8!

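Because of its efficiency, we will use Schrüfer's Reduce package EXCALC, which was built for manipulating expressions within the calculus of exterior forms. For that purpose, we reformulate the metric (93) in terms of an orthonormal coframe of four one-forms ϑ^α = e_i^α dx^i, with the unknown functions f = f(ρ, z), ω = ω(ρ, z), and γ = γ(ρ, z), namely
$$\vartheta^0 = f^{1/2}\,(dt - \omega\,d\phi) = e_i{}^0 dx^i = f^{1/2}\,(dx^0 - \omega\,dx^3), \tag{94}$$
$$\vartheta^1 = f^{-1/2} e^{\gamma}\,d\rho = e_i{}^1 dx^i = f^{-1/2} e^{\gamma}\,dx^1, \tag{95}$$
$$\vartheta^2 = f^{-1/2} e^{\gamma}\,dz = e_i{}^2 dx^i = f^{-1/2} e^{\gamma}\,dx^2, \tag{96}$$
$$\vartheta^3 = f^{-1/2}\rho\,d\phi = e_i{}^3 dx^i = f^{-1/2}\rho\,dx^3. \tag{97}$$
Because of the orthonormality of the coframe ϑ^α, we have
$$ds^2 = g = +\,\vartheta^0\otimes\vartheta^0 - \vartheta^1\otimes\vartheta^1 - \vartheta^2\otimes\vartheta^2 - \vartheta^3\otimes\vartheta^3. \tag{98}$$
Equations (94)–(98) are equivalent to (93).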
The corresponding computer code, as input for Reduce-Excalc, reads as follows:


    % f, omega, gamma are 0-forms (scalar functions) of rho and z only
    pform f=0, omega=0, gamma=0 $
    fdomain f=f(rho,z), omega=omega(rho,z), gamma=gamma(rho,z) ;

    % orthonormal coframe (94)-(97) of the Papapetrou metric (93)
    coframe o(0) = sqrt(f) * (d t - omega * d phi),
            o(1) = sqrt(f)**(-1) * exp(gamma) * d rho,
            o(2) = sqrt(f)**(-1) * exp(gamma) * d z,
            o(3) = sqrt(f)**(-1) * rho * d phi
     with signature (1,-1,-1,-1);


Isn't that simple enough? From this data, the Einstein equation is calculated, with the Einstein tensor G^α_β. The complete, fairly trivial program is documented in the Appendix. Note in particular that we used a TeX interface allowing us to output the expressions directly in LaTeX. This computer output, after some post-editing for display purposes but without changing anything of the formulas, reads as follows:


Qo := (A*Opof + fp —5-O,f7 +p? +4-Opf +f pte Oxf «f+ p? 
—5-O.f? -p?—4-Op.py- f? p>? —4-On,27-f? -p? +3- dpw?- ft 
+3+ Ozu" + f°) /(4-e°7 - f +p"), (99) 
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Gy = f-(2-0,f-0,w-p+2-0.f-0.w-p+O,,w-f-p— Ow: f 


Te Oz, 2 ee p)/(2 e771. cae (100) 
Gt = (Of? - p? —O,f? 0? —4- Oy f? -p— yw - f* 
+ O,w? - f*)/(4-6?7 fon). (101) 


Glo = (Opf -Ocf +p? —2-O.7+ f? « p— Opw + O,w + f*)/(2-e?7- fp), (102) 
Ga = (—Opf? +p? —O,f? « p? — 4+ Op, F? «p? — 4+ One f? 
— Apis” » f* — Agus» f*)/(4-€?7 «f+ p?). (103) 
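This calculation of the Einstein tensor by machine did not require more than about 15 min, including the programming and the typing in. For sample programs, see Socorro et al.'7° and Stauffer et al.17®

Inspecting these equations, it becomes immediately clear that the numerator of (100) does not depend on γ. In order to get a better overview, we abbreviate the partial derivatives of Reduce, ∂_ρ f, by subscripts, f_ρ, and drop the superfluous multiplication dots of Reduce. We find
$$G^0{}_3 = 0 \;\Rightarrow\; 0 = f\left(\omega_{\rho\rho} + \omega_{zz} - \frac{\omega_\rho}{\rho}\right) + 2\left(f_\rho\omega_\rho + f_z\omega_z\right). \tag{104}$$
Moreover, by subtracting (103) from (99), we find another equation free of γ:
$$G^0{}_0 - G^3{}_3 = 0 \;\Rightarrow\; 0 = f\left(f_{\rho\rho} + \frac{f_\rho}{\rho} + f_{zz}\right) - f_\rho^2 - f_z^2 + \frac{f^4}{\rho^2}\left(\omega_\rho^2 + \omega_z^2\right). \tag{105}$$
Left over are Eqs. (101) and (102), which can be resolved with respect to the first derivatives of γ:
$$G^1{}_1 = 0 \;\Rightarrow\; \gamma_\rho = \frac{1}{4}\left[\frac{\rho}{f^2}\left(f_\rho^2 - f_z^2\right) - \frac{f^2}{\rho}\left(\omega_\rho^2 - \omega_z^2\right)\right], \tag{106}$$
$$G^1{}_2 = 0 \;\Rightarrow\; \gamma_z = \frac{1}{2}\left[\frac{\rho}{f^2}\,f_\rho f_z - \frac{f^2}{\rho}\,\omega_\rho\omega_z\right]. \tag{107}$$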


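Collected, we have these four equations determining the stationary axisymmetric vacuum metric:
$$0 = f\left(f_{\rho\rho} + \frac{f_\rho}{\rho} + f_{zz}\right) - f_\rho^2 - f_z^2 + \frac{f^4}{\rho^2}\left(\omega_\rho^2 + \omega_z^2\right), \tag{108}$$
$$0 = f\left(\omega_{\rho\rho} - \frac{\omega_\rho}{\rho} + \omega_{zz}\right) + 2\left(f_\rho\omega_\rho + f_z\omega_z\right), \tag{109}$$
$$\gamma_\rho = \frac{1}{4}\left[\frac{\rho}{f^2}\left(f_\rho^2 - f_z^2\right) - \frac{f^2}{\rho}\left(\omega_\rho^2 - \omega_z^2\right)\right], \tag{110}$$
$$\gamma_z = \frac{1}{2}\left[\frac{\rho}{f^2}\,f_\rho f_z - \frac{f^2}{\rho}\,\omega_\rho\omega_z\right]. \tag{111}$$
Let us underline how effortlessly, with computer assistance, we arrived at these four partial differential equations (PDEs) for determining stationary axially symmetric solutions of Einstein's field equation.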


3.2.2. Ernst equation (1968) 


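One further step leads to a still more convincing form of these PDEs. After some attempts, one recognizes that (109) can be written as
$$\left(\frac{f^2}{\rho}\,\omega_\rho\right)_{\!\rho} + \left(\frac{f^2}{\rho}\,\omega_z\right)_{\!z} = 0. \tag{112}$$
This is the integrability condition for a potential Ω defined by
$$\Omega_z = -\frac{f^2}{\rho}\,\omega_\rho, \qquad \Omega_\rho = \frac{f^2}{\rho}\,\omega_z. \tag{113}$$
Equation (112) is then identically fulfilled. We substitute (113) into (108):
$$f\left(f_{\rho\rho} + \frac{f_\rho}{\rho} + f_{zz}\right) - f_\rho^2 - f_z^2 + \Omega_\rho^2 + \Omega_z^2 = 0. \tag{114}$$
Since (109) is already exploited, we can find an equation for Ω by differentiating the ω's in (113) with respect to z and ρ, respectively, and by adding the emergent expressions (ω_ρz = ω_zρ):
$$f\left(\Omega_{\rho\rho} + \frac{\Omega_\rho}{\rho} + \Omega_{zz}\right) - 2f_\rho\Omega_\rho - 2f_z\Omega_z = 0. \tag{115}$$
Equations (114) and (115) can be put straightforwardly into a vector analytical form, if we recall that our functions do not depend on the angle φ^s:
$$f\,\Delta f - (\nabla f)\cdot\nabla f + (\nabla\Omega)\cdot\nabla\Omega = 0, \tag{116}$$
$$f\,\Delta\Omega - 2\,(\nabla f)\cdot\nabla\Omega = 0. \tag{117}$$
The last equation can also be written as ∇·(f⁻²∇Ω) = 0. Equations (116) and (117) free us from the cylindrical coordinates, that is, these expressions are now put in a form independent of the specific 3D coordinates. With the potential (i² = −1)
$$\mathcal{E} := f + i\,\Omega, \tag{118}$$
which was found by Ernst®? and Kramer and Neugebauer,!°? we find the Ernst equation®?
$$(\mathrm{Re}\,\mathcal{E})\,\Delta\mathcal{E} = \nabla\mathcal{E}\cdot\nabla\mathcal{E}, \tag{119}$$
or, in components,
$$(\mathrm{Re}\,\mathcal{E})\left[\frac{\partial^2\mathcal{E}}{\partial z^2} + \frac{1}{\rho}\frac{\partial}{\partial\rho}\left(\rho\,\frac{\partial\mathcal{E}}{\partial\rho}\right)\right] = \left(\frac{\partial\mathcal{E}}{\partial z}\right)^2 + \left(\frac{\partial\mathcal{E}}{\partial\rho}\right)^2. \tag{120}$$

^s In cylindrical coordinates, we have for a vector V and a scalar s the following formulas, see Jackson®4:
$$\nabla\cdot V = \frac{1}{\rho}\,\partial_\rho(\rho V_1) + \partial_z V_2 + \frac{1}{\rho}\,\partial_\phi V_3, \qquad \nabla\cdot\nabla s = \Delta s = \frac{1}{\rho}\,\partial_\rho(\rho\,\partial_\rho s) + \partial_z^2 s + \frac{1}{\rho^2}\,\partial_\phi^2 s,$$
$$\nabla s = e_1\,\partial_\rho s + e_2\,\partial_z s + e_3\,\frac{1}{\rho}\,\partial_\phi s, \qquad \nabla s\cdot\nabla s = (\partial_\rho s)^2 + (\partial_z s)^2 + \frac{1}{\rho^2}\,(\partial_\phi s)^2.$$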

The “Re” denotes the real part of a complex quantity. Under stationary axial symmetry (the corresponding metric is displayed in (93)), the Ernst equation (119), together with Eqs. (118), (113), (110) and (111), is equivalent to the vacuum Einstein field equation.


3.2.3. From Ernst back to Kerr 


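This reduces the problem of axial symmetry to the solution of the second order PDE (119). This method, which came along only five years after Kerr's publication, led to many new exact solutions, amongst them the Kerr solution (1963) as one of the simplest cases. We are only going to sketch how one eventually arrives at the Kerr solution. We follow here closely Buchdahl.?*

One introduces a new complex potential ξ by
$$\mathcal{E} = \frac{\xi - 1}{\xi + 1}. \tag{121}$$
Then the Ernst equation becomes
$$(\xi\bar\xi - 1)\,\Delta\xi = 2\,\bar\xi\,\nabla\xi\cdot\nabla\xi, \tag{122}$$
where the overline denotes complex conjugation. If one has a solution of this equation, we can determine the functions f, ω and γ by
$$f = \mathrm{Re}\,\frac{\xi - 1}{\xi + 1}, \tag{123}$$
$$\omega_\rho = -2\rho\,\frac{\mathrm{Im}\left[(\bar\xi + 1)^2\,\xi_z\right]}{(\xi\bar\xi - 1)^2}, \qquad \omega_z = 2\rho\,\frac{\mathrm{Im}\left[(\bar\xi + 1)^2\,\xi_\rho\right]}{(\xi\bar\xi - 1)^2}, \tag{124}$$
$$\gamma_\rho = \rho\,\frac{\xi_\rho\bar\xi_\rho - \xi_z\bar\xi_z}{(\xi\bar\xi - 1)^2}, \qquad \gamma_z = 2\rho\,\frac{\mathrm{Re}\left(\xi_\rho\bar\xi_z\right)}{(\xi\bar\xi - 1)^2}. \tag{125}$$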

For rotating bodies, prolate spheroidal coordinates x, y, with a constant k, are much better adapted:
$$\rho = k\,(x^2 - 1)^{1/2}\,(1 - y^2)^{1/2}, \qquad z = kxy. \tag{126}$$
It turns out that one simple potential solving the Ernst equation, with the constants p and q, is
$$\xi = px - iqy \quad\text{with}\quad p^2 + q^2 = 1. \tag{127}$$
It leads to the Kerr metric. For this purpose, one has to introduce the redefined constants m := k/p (mass) and a := kq/p (angular momentum per mass) and to execute subsequently the transformations px = (r/m) − 1 and qy = (a/m) cos θ to the new coordinates r and θ. Then one arrives at the Kerr metric in Boyer–Lindquist coordinates, which is displayed in Table 4. For more detail, compare, for instance, the books of Buchdahl,?? Islam,°? Heusler,> Meinel et al.!?3 or Griffiths et al.®° By similar techniques, a Kerr solution with a topological defect was found by Bergamini et al.‘
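That the potential (127) solves the Ernst equation only for p² + q² = 1 can be probed numerically. The sketch below is a plausibility check, not a proof: it inverts (126) via the two distances √(ρ² + (z ± k)²) = k(x ± y), forms the Ernst potential from (121), and evaluates the component form (120) with central finite differences. The function names, the step size h and the sample point are assumptions chosen for illustration.

```python
import math


def ernst_potential(rho, z, p, q, k=1.0):
    """Ernst potential (xi - 1)/(xi + 1) with xi = p*x - i*q*y, see (121) and (127)."""
    d_plus = math.hypot(rho, z + k)     # equals k*(x + y) by (126)
    d_minus = math.hypot(rho, z - k)    # equals k*(x - y) by (126)
    x = (d_plus + d_minus) / (2.0 * k)
    y = (d_plus - d_minus) / (2.0 * k)
    xi = complex(p * x, -q * y)
    return (xi - 1.0) / (xi + 1.0)


def ernst_residual(rho, z, p, q, k=1.0, h=1.0e-3):
    """Left side minus right side of (120), using central finite differences."""
    def pot(r, s):
        return ernst_potential(r, s, p, q, k)

    e0 = pot(rho, z)
    d_rho = (pot(rho + h, z) - pot(rho - h, z)) / (2.0 * h)
    d_z = (pot(rho, z + h) - pot(rho, z - h)) / (2.0 * h)
    d_rho2 = (pot(rho + h, z) - 2.0 * e0 + pot(rho - h, z)) / h ** 2
    d_z2 = (pot(rho, z + h) - 2.0 * e0 + pot(rho, z - h)) / h ** 2
    laplace = d_z2 + d_rho2 + d_rho / rho
    return e0.real * laplace - (d_z ** 2 + d_rho ** 2)
```

With p = 0.8 and q = 0.6 the residual is at the level of the discretization error, whereas a choice such as p = q = 0.5, which violates p² + q² = 1, leaves a residual of order 10⁻².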

Incidentally, in the context of the Ernst equation, Geroch made the following 
interesting conjecture: A subset of all stationary axially symmetric vacuum space- 
times, including all of its asymptotically flat members, that is, in particular the 
Kerr solution, can be obtained from Minkowski space by transformations gener- 
ated by an infinite-dimensional Lie group. This conjecture was “proved” by Hauser 
and Ernst,” see also Ref. 75. However, the proof contained a mistake that was 
subsequently corrected in Ref. 77. 

Starting from 4D ellipsoidal coordinates, Dadhich*® gave a heuristic derivation 
of the Kerr metric by requiring, amongst other things, that light propagation should 
be influenced by gravity. 


3.3. Three classical representations of the Kerr metric 


We collected the three classical versions of the Kerr metric in Table 4, see also Visser.18° Three more coordinate systems should at least be mentioned:

• Pretorius and Israel'®*’ double null coordinates:
  Very convenient to tackle the initial value problem.

• Doran*8 coordinates:
  Gullstrand–Painlevé like; useful in analog gravity.

• Debever/Plebański–Demiański!®> coordinates:
  Components of the metric are rational polynomials; convenient for (computer assisted) calculations.


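As input for checking the Kerr solution, we use the orthonormal coframe'®!
$$\vartheta^0 := \frac{\sqrt{|\Delta|}}{\rho}\left(dt - a\sin^2\theta\,d\phi\right), \tag{128}$$
$$\vartheta^1 := \frac{\rho}{\sqrt{|\Delta|}}\,dr, \tag{129}$$
$$\vartheta^2 := \rho\,d\theta, \tag{130}$$
$$\vartheta^3 := \frac{\sin\theta}{\rho}\left[(r^2 + a^2)\,d\phi - a\,dt\right]. \tag{131}$$

Table 4. Kerr metric: The three classical representations.

Kerr–Schild (t, x, y, z), Cartesian background:
$$ds^2 = -dt^2 + dx^2 + dy^2 + dz^2 + \frac{2mr^3}{r^4 + a^2z^2}\left[dt + \frac{r\,(x\,dx + y\,dy)}{a^2 + r^2} + \frac{a\,(y\,dx - x\,dy)}{a^2 + r^2} + \frac{z}{r}\,dz\right]^2,$$
$$x^2 + y^2 + z^2 = r^2 + a^2\left(1 - \frac{z^2}{r^2}\right), \qquad r = r(x, y, z).$$

Relation between Kerr–Schild and Boyer–Lindquist coordinates:
$$x = (r\cos\phi + a\sin\phi)\sin\theta, \qquad y = (r\sin\phi - a\cos\phi)\sin\theta, \qquad z = r\cos\theta.$$

Boyer–Lindquist (t, r, θ, φ), Schwarzschild like:
$$ds^2 = -\left(1 - \frac{2mr}{\rho^2}\right)dt^2 - \frac{4mra\sin^2\theta}{\rho^2}\,dt\,d\phi + \frac{\rho^2}{\Delta}\,dr^2 + \rho^2 d\theta^2 + \left(r^2 + a^2 + \frac{2mra^2\sin^2\theta}{\rho^2}\right)\sin^2\theta\,d\phi^2,$$
$$\rho^2 := r^2 + a^2\cos^2\theta, \qquad \Delta := r^2 - 2mr + a^2 = (r - r_+)(r - r_-).$$

Relation between Boyer–Lindquist and Kerr original coordinates:
$$dv = dt + \frac{r^2 + a^2}{\Delta}\,dr, \qquad d\varphi = d\phi + \frac{a}{\Delta}\,dr.$$

Kerr original (v, r, θ, ϕ), Eddington–Finkelstein like:
$$ds^2 = -\left(1 - \frac{2mr}{\rho^2}\right)\left(dv - a\sin^2\theta\,d\varphi\right)^2 + 2\left(dv - a\sin^2\theta\,d\varphi\right)\left(dr - a\sin^2\theta\,d\varphi\right) + \rho^2\left(d\theta^2 + \sin^2\theta\,d\varphi^2\right).$$

Ergosurfaces and horizons:
$$r_{E_\pm} := m \pm \sqrt{m^2 - a^2\cos^2\theta}, \qquad r_\pm := m \pm \sqrt{m^2 - a^2}.$$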


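We introduce the sign function, which is convenient for discussing the different regions in the Penrose–Carter diagram:
$$\epsilon := \begin{cases} +1 & \text{for } r > r_+ \text{ or } r < r_-, \\ -1 & \text{for } r_- < r < r_+. \end{cases} \tag{132}$$
The metric can then be written in terms of the coframe as
$$ds^2 = g = \epsilon\left(-\vartheta^0\otimes\vartheta^0 + \vartheta^1\otimes\vartheta^1\right) + \vartheta^2\otimes\vartheta^2 + \vartheta^3\otimes\vartheta^3. \tag{133}$$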
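The equivalence of the coframe form (133) with the Boyer–Lindquist line element of Table 4 can be spot-checked numerically. The sketch below assumes that the coframe (128)–(131) is built with √|Δ| and that the sign function (132) equals the sign of Δ; the function names and sample points are illustrative only.

```python
import math


def bl_components(m, a, r, theta):
    """Boyer-Lindquist components (g_tt, g_tphi, g_rr, g_thth, g_phph) as in Table 4."""
    s2 = math.sin(theta) ** 2
    rho2 = r * r + a * a * math.cos(theta) ** 2
    delta = r * r - 2.0 * m * r + a * a
    return (-(1.0 - 2.0 * m * r / rho2),
            -2.0 * m * r * a * s2 / rho2,   # ds^2 contains 2 g_tphi dt dphi
            rho2 / delta,
            rho2,
            (r * r + a * a + 2.0 * m * r * a * a * s2 / rho2) * s2)


def coframe_components(m, a, r, theta):
    """The same components assembled from the coframe (128)-(131) via (133)."""
    s = math.sin(theta)
    rho2 = r * r + a * a * math.cos(theta) ** 2
    rho = math.sqrt(rho2)
    delta = r * r - 2.0 * m * r + a * a
    eps = 1.0 if delta > 0.0 else -1.0      # sign function (132)
    w = math.sqrt(abs(delta)) / rho
    # components of each one-form in the coordinate basis (dt, dr, dtheta, dphi)
    th0 = (w, 0.0, 0.0, -w * a * s * s)
    th1 = (0.0, 1.0 / w, 0.0, 0.0)
    th2 = (0.0, 0.0, rho, 0.0)
    th3 = (-a * s / rho, 0.0, 0.0, (r * r + a * a) * s / rho)

    def g(i, j):
        return (eps * (-th0[i] * th0[j] + th1[i] * th1[j])
                + th2[i] * th2[j] + th3[i] * th3[j])

    return (g(0, 0), g(0, 3), g(1, 1), g(2, 2), g(3, 3))
```

Both functions should agree outside the outer horizon as well as between the two horizons, where Δ < 0 and the sign function switches the roles of ϑ⁰ and ϑ¹.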


From Table 4 it is not complicated to read off the Schwarzschild and the Lense–Thirring metric as special cases. In comparison to the Schwarzschild metric, the Kerr solution includes a new parameter a, which will be related to the angular momentum. By setting a = 0, the Kerr metric reduces to the Schwarzschild metric, as it should (ρ² → r² and Δ → r² − 2mr):
$$ds^2 = -\left(1 - \frac{2m}{r}\right)dt^2 + \frac{r^2}{r^2 - 2mr}\,dr^2 + r^2 d\theta^2 + r^2\sin^2\theta\,d\phi^2. \tag{134}$$
By canceling r² in the dr²-term, we immediately recognize the Schwarzschild metric.

Considering the parameter a, we should note the following fact. For small values of a, where we may neglect terms of the order of a², we arrive at (ρ² → r² and Δ → r² − 2mr)
$$ds^2 = -\left(1 - \frac{2m}{r}\right)dt^2 + \frac{1}{1 - \frac{2m}{r}}\,dr^2 + r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right) - \frac{4ma\sin^2\theta}{r}\,dt\,d\phi. \tag{135}$$
Since, in spherical coordinates, we have x dy − y dx = r² sin²θ dφ, the cross-term is proportional to (4ma sin²θ/r) dt dφ = (4ma/r³)(x dy − y dx) dt. Thus, in the limiting case of small a, the Kerr metric yields the Lense–Thirring metric, provided we identify ma = J_z.


3.4. The concept of a Kerr black hole 


We come back to our Fig. 7 with “Schwarzschild” versus “Kerr”. The Kerr space- 
time may be visualized by a vortex, where the water of the lake spirals toward the 
center. Much of the above said for the Schwarzschild case is still valid. However, one 
important difference occurs. The stationary limit and the event horizon separate, 
which will be illustrated by corresponding graphical representations. 

In the case of a vortex, the flow velocity of the in-spiraling water has two components: the radial component drags the boat toward the center, whereas the additional angular component forces the boat to circle around the center. Again, the stationary limit is defined by the distance at which the boat ultimately can withstand the radial and circular drag of the water flow. Beyond the stationary limit the situation is not as hopeless as in the Schwarzschild case. Using all its power, the boat may brave the inward flow. But then it does not have enough power to overcome the angular drag and is forced to orbit the center. By means of a clever spiral course the boat may even escape beyond the stationary limit. The stationary limit is not necessarily an event horizon. At some distance, nearer to the center than the stationary limit, the pure radial flow of water will also exceed the power of the boat. There, inside the stationary limit, is the event horizon.

In order to investigate the structure of the Kerr spacetime, we first look at “strange behavior” of the metric components in Boyer–Lindquist coordinates. The following cases can be distinguished:

Δ = 0: g_rr becomes singular,
ρ² = 2mr: g_tt vanishes,
ρ² = 0: g_rr and g_θθ vanish, the other components are singular.


As we have extensively discussed in the previous section, singularities of com- 
ponents of the metric may signify physical effects but, on the other hand, may 
only be due to “defective” coordinates. Thus, we will proceed along similar lines to 
investigate the nature of these singularities. 

We will not address the geodesics of the Kerr metric in detail. For an elementary
discussion the reader is referred to Frolov and Novikov and to the more advanced
discussion in Hackmann et al.


3.4.1. Depicting Kerr geometry 


We draw a picture of the spatial appearances and relations of the various horizons 
and the singularity of the Kerr metric. From outside to inside these are, explicitly, 


outer ergosurface    $r_{E_+} := m + \sqrt{m^2 - a^2\cos^2\theta}$
    ↕ joined at polar axis
event horizon        $r_+ := m + \sqrt{m^2 - a^2}$
    ↕ merge for $a \to m$
Cauchy horizon       $r_- := m - \sqrt{m^2 - a^2}$                    (136)
    ↕ joined at polar axis
inner ergosurface    $r_{E_-} := m - \sqrt{m^2 - a^2\cos^2\theta}$
    ↕ lies on the rim for $\theta = \pi/2$
singularity          $r = 0$.


For a = 0, inner ergosurface and Cauchy horizon vanish, whereas outer ergo- 
surface and event horizon merge to the Schwarzschild horizon. To visualize the 
various surfaces we use Kerr—Schild quasi-Cartesian coordinates. The radial coor- 
dinate r of the Boyer—Lindquist coordinates is related to the coordinates x, y, z of 
the Kerr—Schild coordinates via, see Table 4 

$$x^2 + y^2 + \frac{r^2 + a^2}{r^2}\,z^2 = r^2 + a^2, \qquad z = r\cos\theta. \tag{137}$$


Substituting $r = 0$, $r = r_\pm$, $r = r_{E_\pm}$, and a little bit of algebra yields:


• Singularity $r = 0$

Since $r = 0$ leads to $z = 0$, we get the equation of a circle of radius $a$ in the
equatorial plane

$$x^2 + y^2 = a^2. \tag{138}$$

For $a = 0$, the ring collapses to the Schwarzschild singularity.
A closer inspection shows that the structure of the singularity is more
complex.


• Horizons $r = r_\pm$

In this case we arrive at the equation for an oblate (for $a < m$) ellipsoid

$$\frac{x^2}{a_1^2} + \frac{y^2}{a_2^2} + \frac{z^2}{a_3^2} = 1, \tag{139}$$

where

$$a_1^2 = a_2^2 = r_\pm^2 + a^2 > a_3^2 = r_\pm^2. \tag{140}$$


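• Ergosurfaces $r = r_{E_\pm}(\theta)$

Things are a little bit more involved in this case because $r$ is not constant. We
can also derive an “ellipsoid-like” equation (for $a < m$)

$$\frac{x^2}{a_1^2(\theta)} + \frac{y^2}{a_2^2(\theta)} + \frac{z^2}{a_3^2(\theta)} = 1, \tag{141}$$

now with

$$a_1^2(\theta) = a_2^2(\theta) = r_{E_\pm}^2(\theta) + a^2, \qquad a_3^2(\theta) = r_{E_\pm}^2(\theta). \tag{142}$$

The $\theta$-dependence will deform the ellipsoid. On the equatorial plane with $\theta = \pi/2$
we have $r_{E_-} = 0$. Hence, $a_1 = a_2 = a$ and $a_3$ vanishes, so that the $z^2$-term in (141)
becomes singular. This results in a nonregular rim on which the ring singularity is located.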

For $a > m$, $r_{E_\pm}$ is partly not defined, since the term under the square root
changes sign and becomes negative if

$$\cos^2\theta > \frac{m^2}{a^2}. \tag{143}$$

The limiting case of equality defines two rings with $\theta_1 = \arccos(m/a)$, $\theta_2 = \pi - \theta_1$, and $r = m$. As a
consequence, the outer ergosurface only extends to these rings from the outside, 
and the inner ergosurface up to the rings from the inside. The outcome is a kind 
of torus. The center-facing side is constituted by a part of the inner ergosurface 
(along with the ring singularity), whereas the outside facing parts are given by a 
part of the outer ergosurface. 

An extensive discussion of the embedding of the ergosurfaces into Euclidean 
space, together with corresponding Mathematica-programs, can be found in 
Ref. 114. 


The surfaces are visualized in Fig. 12. We did not use a faithful embedding but
rescaled axes in order to achieve better visibility.

[Figure 12, three panels: $a < m$, $a = m$, and $a > m$. Labeled surfaces: outer ergosurface, event horizon, ring singularity, Cauchy horizon, inner ergosurface, ergoregion.]

Fig. 12. Ergosurfaces, horizons, and singularity for slow, extremal (“critical”), and fast Kerr
black holes. (For color version, see page I-CP2.)


The presence of the term $m^2 - a^2$ requires the distinction of three different
cases dependent on the values of the mass parameter $m$ and the angular momentum
parameter $a$:


$m > a$: slow rotation,    $m = a$: critical rotation,    $m < a$: fast rotation.

The slowly rotating case shows the richest structure. Both ergosurfaces and both
horizons are present and distinct from each other. As $a$ approaches $m$, the event and
the Cauchy horizon draw nearer and nearer. At critical rotation, $a = m$, both
horizons merge into one single event horizon with $r = m$. Eventually, for fast rotation,
$a > m$, the event horizon disappears and reveals the naked ring singularity, which
is now located at the inner edge of the toroidally shaped (outer) ergosurface.
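
The radii listed in (136) and the three rotation regimes can be explored numerically. The following is only a minimal sketch in geometrized units; the function names `kerr_surfaces` and `rotation_class` are illustrative choices and do not appear in the text. A surface whose square root becomes imaginary is reported as `None`.

```python
import math


def kerr_surfaces(m, a, theta):
    """Boyer-Lindquist radii of the surfaces in (136) at polar angle theta.

    Returns None for a surface that does not exist for the given (m, a, theta),
    i.e. when the argument of the square root is negative.
    """
    def root(x):
        return math.sqrt(x) if x >= 0.0 else None

    s_hor = root(m * m - a * a)                          # enters r_+ and r_-
    s_erg = root(m * m - (a * math.cos(theta)) ** 2)     # enters r_E+ and r_E-
    return {
        "outer_ergosurface": None if s_erg is None else m + s_erg,
        "event_horizon": None if s_hor is None else m + s_hor,
        "cauchy_horizon": None if s_hor is None else m - s_hor,
        "inner_ergosurface": None if s_erg is None else m - s_erg,
    }


def rotation_class(m, a):
    """Classify the black hole as 'slow', 'critical', or 'fast' rotating."""
    if math.isclose(abs(a), m):
        return "critical"
    return "slow" if abs(a) < m else "fast"


if __name__ == "__main__":
    for a in (0.0, 0.6, 1.0, 2.0):
        print(a, rotation_class(1.0, a), kerr_surfaces(1.0, a, math.pi / 2))
```

On the polar axis ($\theta = 0$) the ergosurface radii coincide with the horizon radii, and on the equator the inner ergosurface reaches $r = 0$, in line with the annotations in (136).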


3.5. The ergoregion 


We explore the region between the outer ergosurface and the event horizon. There
it is not possible to stand still; everything has to rotate, even the event horizon. The
compulsory rotation in the ergoregion allows one to extract energy from the black
hole. This so-called Penrose process leads to black hole thermodynamics.


3.5.1. Constrained rotation 


The outer ergosurface, $r = r_{E_+}$, is defined by the equation $g_{tt}(r_{E_+}) = 0$. Thus, it
is a surface of infinite redshift and a Killing horizon. For a third characterization
of the ergosurfaces we have to deal not only with radial but also with rotational
motion. Consider the Kerr metric in Boyer–Lindquist coordinates with $dr = d\theta = 0$,

$$ds^2 = g_{tt}\,dt^2 + 2g_{t\phi}\,dt\,d\phi + g_{\phi\phi}\,d\phi^2,$$

or, after dividing by $dt^2$,

$$\left(\frac{ds}{dt}\right)^2 = g_{tt} + 2g_{t\phi}\,\frac{d\phi}{dt} + g_{\phi\phi}\left(\frac{d\phi}{dt}\right)^2.$$

The explicit form of the metric components is not needed here. Note that $\Omega = d\phi/dt$
is the angular velocity with respect to a distant observer,

$$\left(\frac{ds}{dt}\right)^2 = g_{tt} + 2g_{t\phi}\,\Omega + g_{\phi\phi}\,\Omega^2. \tag{144}$$

The worldline of a particle has to be timelike, $ds^2 < 0$. Since the last equation is
quadratic in $\Omega$, this is only possible between the roots of $ds^2 = 0$,

$$\Omega_{\rm min/max} := -\frac{g_{t\phi}}{g_{\phi\phi}} \mp \sqrt{\left(\frac{g_{t\phi}}{g_{\phi\phi}}\right)^2 - \frac{g_{tt}}{g_{\phi\phi}}}.$$

What does

$$\Omega_{\rm min} < \Omega < \Omega_{\rm max}$$

mean? In flat Minkowski spacetime (with Cartesian coordinates), $\Omega_{\rm min/max} = \pm 1$
implies that a particle may, e.g., freely circle around a point, restricted only by
the condition $|v| = |r\,\Omega| < c$. In the Kerr spacetime, at $r = r_{E_+}$, the smallest
possible value of $\Omega$ becomes 0. The particle may just stay at rest, but can rotate
only in one direction, namely in the direction of the angular momentum of the black
hole. Beyond $r_{E_+}$, $\Omega$ is forced to be larger than zero: The particle must co-rotate
with the black hole.

The preceding statement is only correct for radially infalling particles. In general,
the influence of the rotating black hole on the motion of particles is more
complex.


3.5.2. Rotation of the event horizon 


The behavior of $\Omega_{\rm min/max}$ on the event horizon is quite remarkable. By using the identities

$$2mr_\pm = r_\pm^2 + a^2, \qquad \rho^2 - r^2 = a^2\cos^2\theta = a^2 - a^2\sin^2\theta, \tag{145}$$

one finds for the event horizon ($r = r_+$),

$$\Omega_{\rm min} = \Omega_{\rm max} = \Omega_H := \frac{a}{2mr_+}. \tag{146}$$

To interpret this result we use (144) and write

$$g_{\mu\nu}\,l^\mu l^\nu\big|_{r=r_+} = 0, \qquad \text{with} \quad l^\mu = (1, 0, 0, \Omega_H). \tag{147}$$

The integral lines of $l^\mu = \dot{x}^\mu$,

$$x^\mu = (t, r_+, \theta_0, \Omega_H t), \tag{148}$$

define a lightlike hypersurface rotating with a uniform angular velocity: The event
horizon of a Kerr black hole rotates “rigidly” with $\Omega_H$, see in this context Frolov
and Frolov. A consequence of this finding is discussed in the next paragraph.
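
The bounds on $\Omega$ can be evaluated numerically once explicit metric components are supplied. The sketch below assumes the standard Boyer–Lindquist components of the Kerr metric in signature $(-,+,+,+)$ with $G = c = 1$, which the text above deliberately leaves implicit; the function names are hypothetical. It illustrates that $\Omega_{\rm min}$ vanishes on the outer ergosurface and that both bounds coincide with $a/(2mr_+)$ on the event horizon.

```python
import math


def bl_metric_components(m, a, r, theta):
    """Return (g_tt, g_tphi, g_phiphi) of Kerr in Boyer-Lindquist coordinates.

    Assumption: standard textbook form, signature (-,+,+,+), G = c = 1.
    """
    s2 = math.sin(theta) ** 2
    rho2 = r * r + (a * math.cos(theta)) ** 2
    g_tt = -(1.0 - 2.0 * m * r / rho2)
    g_tphi = -2.0 * m * r * a * s2 / rho2
    g_phiphi = (r * r + a * a + 2.0 * m * r * a * a * s2 / rho2) * s2
    return g_tt, g_tphi, g_phiphi


def omega_bounds(m, a, r, theta):
    """Roots of g_tt + 2 g_tphi Omega + g_phiphi Omega^2 = 0, cf. (144)."""
    g_tt, g_tphi, g_phiphi = bl_metric_components(m, a, r, theta)
    center = -g_tphi / g_phiphi
    disc = center * center - g_tt / g_phiphi
    # Clamp tiny negative values caused by rounding on the horizon itself.
    root = math.sqrt(max(disc, 0.0))
    return center - root, center + root


if __name__ == "__main__":
    m, a = 1.0, 0.6
    r_plus = m + math.sqrt(m * m - a * a)
    print("horizon:", omega_bounds(m, a, r_plus, 1.0), a / (2 * m * r_plus))
    print("ergoregion, equator:", omega_bounds(m, a, 1.9, math.pi / 2))
    print("far away:", omega_bounds(m, a, 50.0, math.pi / 2))
```

Far from the hole the two bounds have opposite signs, so both senses of rotation are allowed; inside the ergoregion both bounds are positive.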


3.5.3. Penrose process and black hole thermodynamics 


The (outer) ergosurface is a Killing horizon, not an event horizon. It is possible for 
particles to pass from the inside to the outside. This allows for a peculiar scenario: 
Since inside the Killing horizon the particle is forced to spin around, it picks up 
an additional rotational energy. This energy can be partly extracted by means of 
the Penrose process. An infalling particle traverses the Killing horizon, picks up 
rotational energy and subsequently decays into two parts. If one part plunges into 
the event horizon, the other part, carrying away some of the rotational energy, can 
return to the outside of the Killing horizon. Thus, the region between Killing and 
event horizon is justly labeled as “ergoregion” (from Greek ergon = work). 

The observation that energy can also be extracted from the black hole gave rise 
to black hole thermodynamics. The next question is then how the parameters change 
if the black hole is infinitesimally disturbed. It was Bekenstein!? who established a 
relation between the variations of the mass, the angular momentum, and the area 
of the event horizon. Using the coframe (186)–(190), for $\Lambda = 0$, we find for the area
of the event horizon

$$A = \int_{r=r_+,\;t={\rm const.}} \vartheta^{\hat\theta}\wedge\vartheta^{\hat\phi} = \int_{r=r_+,\;t={\rm const.}} \sin\theta\,(r_+^2 + a^2)\,d\theta\wedge d\phi = 4\pi(r_+^2 + a^2). \tag{149}$$

We can rewrite (149), using (145) and $J = ma$,

$$A = 8\pi m r_+ = 8\pi\left(m^2 + \sqrt{m^4 - J^2}\right). \tag{150}$$

The differential of this equation is

$$dA = \frac{\partial A}{\partial m}\,dm + \frac{\partial A}{\partial J}\,dJ = \frac{8\pi}{\kappa}\,dm - \frac{8\pi}{\kappa}\,\Omega_H\,dJ, \tag{151}$$

with

$$\kappa = \frac{1}{2m}\,\frac{\sqrt{m^4 - J^2}}{m^2 + \sqrt{m^4 - J^2}}, \qquad \Omega_H = \kappa\,\frac{J}{\sqrt{m^4 - J^2}}. \tag{152}$$

The parameter $\Omega_H$ is the angular velocity of the horizon (146). The parameter $\kappa$ is
the surface gravity. Equation (151) can be rewritten as

$$dm = \frac{\kappa}{8\pi}\,dA + \Omega_H\,dJ. \tag{153}$$

The infinitesimal change of the mass, $dm$, is proportional to the infinitesimal change
of the energy, $dE$. The term $\Omega_H\,dJ$ describes the infinitesimal change of the rotational
energy. This suggests the identification of (153) with the first law of thermodynamics.
The analogy becomes still more compelling when one observes that, for a given black
hole of initial (or irreducible) mass $m$, the area of the horizon is always increasing.
Even by exercising a Penrose process, which extracts rotational energy from the
black hole, a fragment of the incoming particle will fall into the black hole thereby
increasing its mass and, in turn, the area of the horizon. Accordingly, the area $A$
of the horizon behaves formally as if it is proportional to an entropy $S$ and the
surface gravity $\kappa$ as if it is proportional to a temperature $T$. In fact, the Hawking
temperature and the Bekenstein–Hawking entropy turn out to be

$$T = \frac{\hbar}{2\pi k_B}\,\kappa, \qquad S = \frac{k_B}{4G\hbar}\,A, \tag{154}$$

with $k_B$ as the Boltzmann constant. Equation (153) together with its thermodynamical
interpretation (154) can be considerably generalized thereby establishing
the new discipline of “black hole thermodynamics”, see Heusler and Carlip.
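
As a plausibility check, the first law (153) can be tested with finite differences using the closed forms (150) and (152). The snippet below is a sketch under the assumptions $G = c = 1$ and $m^4 > J^2$; the function names are illustrative and not taken from the text.

```python
import math


def horizon_area(m, J):
    """Area of the event horizon, Eq. (150)."""
    return 8.0 * math.pi * (m * m + math.sqrt(m ** 4 - J * J))


def surface_gravity(m, J):
    """Surface gravity kappa, first expression in Eq. (152)."""
    s = math.sqrt(m ** 4 - J * J)
    return s / (2.0 * m * (m * m + s))


def horizon_angular_velocity(m, J):
    """Angular velocity Omega_H = kappa * J / sqrt(m^4 - J^2), Eq. (152)."""
    s = math.sqrt(m ** 4 - J * J)
    return surface_gravity(m, J) * J / s


def first_law_residual(m, J, dm, dJ):
    """dm - (kappa/(8 pi) dA + Omega_H dJ) for a small finite step (dm, dJ)."""
    dA = horizon_area(m + dm, J + dJ) - horizon_area(m, J)
    rhs = surface_gravity(m, J) / (8.0 * math.pi) * dA
    rhs += horizon_angular_velocity(m, J) * dJ
    return dm - rhs


if __name__ == "__main__":
    # The residual is of second order in the step size.
    print(first_law_residual(1.0, 0.6, 1e-6, 2e-6))
```

For $J = 0$ the functions reduce to the Schwarzschild values $A = 16\pi m^2$ and $\kappa = 1/(4m)$.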


3.6. Beyond the horizons 


In the Schwarzschild spacetime, event horizon and Killing horizon coincide. In the
Kerr spacetime, for $m > a$, there is an outer Killing horizon, an event horizon, an
inner Killing horizon and an inner horizon. So far, all the coordinate systems we
used for the Kerr metric show singularities at the outer and inner horizons $r = r_\pm$.
The construction of a regular coordinate system is possible along the same lines as
for the Schwarzschild metric. Of course, the corresponding calculations are much
more involved for the Kerr case. Therefore, we will give a more heuristic
approach to motivate Kruskal-like coordinates for the Kerr metric.
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3.6.1. Using light rays as coordinate lines 


Our first task is to construct Eddington–Finkelstein-like coordinates for the Kerr
metric by considering radial light rays. We restrict ourselves to the case $\theta = 0 = \phi$.
The Kerr metric in Boyer–Lindquist coordinates reduces to ($\theta = 0 \Rightarrow \rho^2 = r^2 + a^2 =
\Delta + 2mr$):

$$ds^2 = -\frac{\Delta}{\rho^2}\,dt^2 + \frac{\rho^2}{\Delta}\,dr^2.$$

Hence, for in-/out-going light rays, $ds^2 = 0$, we find

$$dt^2 = \frac{\rho^4}{\Delta^2}\,dr^2 = \left(\frac{r^2 + a^2}{r^2 - 2mr + a^2}\right)^2 dr^2,$$

or, explicitly,

$$t = \pm\int \frac{r^2 + a^2}{(r - r_-)(r - r_+)}\,dr = \pm\left[r + \frac{r_+^2 + a^2}{r_+ - r_-}\ln|r - r_+| - \frac{r_-^2 + a^2}{r_+ - r_-}\ln|r - r_-|\right] + {\rm const.} \tag{155}$$

Unlike in the Schwarzschild spacetime, two horizons form, at $r = r_-$
and $r = r_+$, respectively. However, as $a \to 0$, $r_-$ goes to 0, whereas $r_+$ approaches
$2m$ and the Schwarzschild situation is reproduced.

We next focus on the (Boyer–Lindquist) coordinates $(t, r)$ and on how the horizons
etc. will appear in terms of the new coordinates. The other coordinates and the
regularity of the metric are not addressed. However, all the details can be found in
the literature, see Refs. 20, 30 and 78. Using (155) analogously to (42), we introduce
Eddington–Finkelstein-like coordinates for Kerr,

$$v := t + r + \sigma_+\ln|r - r_+| - \sigma_-\ln|r - r_-|, \tag{156}$$

$$u := t - r - \sigma_+\ln|r - r_+| + \sigma_-\ln|r - r_-|, \tag{157}$$

where (according to the notation in Ref. 20)

$$\sigma_\pm := \frac{r_\pm^2 + a^2}{r_+ - r_-} = \frac{m r_\pm}{\sqrt{m^2 - a^2}}. \tag{158}$$
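
The closed form of the integral in (155), with the coefficients $\sigma_\pm$ of (158), can be cross-checked by numerical quadrature. The following is a minimal sketch for $a < m$ with $G = c = 1$; the helper names (`sigma_pm`, `tortoise`, `simpson`) are illustrative and not part of the text, and a plain composite Simpson rule stands in for any quadrature routine.

```python
import math


def sigma_pm(m, a):
    """Return (sigma_+, sigma_-, r_+, r_-) according to Eq. (158), for a < m."""
    root = math.sqrt(m * m - a * a)
    r_plus, r_minus = m + root, m - root
    return m * r_plus / root, m * r_minus / root, r_plus, r_minus


def tortoise(m, a, r):
    """Antiderivative of (r^2 + a^2)/Delta appearing in (155)-(157)."""
    s_plus, s_minus, r_plus, r_minus = sigma_pm(m, a)
    return (r + s_plus * math.log(abs(r - r_plus))
            - s_minus * math.log(abs(r - r_minus)))


def integrand(m, a, r):
    """(r^2 + a^2)/Delta with Delta = r^2 - 2 m r + a^2."""
    return (r * r + a * a) / (r * r - 2.0 * m * r + a * a)


def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(lo + k * h)
    return total * h / 3.0


if __name__ == "__main__":
    m, a = 1.0, 0.6
    numeric = simpson(lambda r: integrand(m, a, r), 3.0, 10.0)
    exact = tortoise(m, a, 10.0) - tortoise(m, a, 3.0)
    print(numeric, exact)
```

For $a = 0$ one finds $\sigma_+ = 2m$ and $\sigma_- = 0$, so that `tortoise` reduces to the Schwarzschild expression $r + 2m\ln|r - 2m|$.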


Again, we can get rid of the coordinate singularity by rescaling $u$ and $v$ analogously
to (48). Since we have two horizons, $r = r_+$ and $r = r_-$, we have to decide with
respect to which singularity we rescale. We first choose $r_+$ and define, see (48),


1 
- 2 Prt 
O1= exp( Z ) = ie Bal ae A (159) 


o it 
_— exp( e ) = a ad (160) 
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with 
C4. ee 
Again, we go back to time- and space-like coordinates, exactly like in (50) 


f= 5 +a), ee me (162) 


Then we work out the four coordinate patches exactly like (55)—(60). We arrive at a 
Kruskal like coordinate system. However, there arises an important difference: The 
coordinate system still is singular for r = r_. This can be most easily seen from 
the analog to (59), the inverse transformation to r, which now reads 


' oe eo pas 


P—P?= pe (163) 


vu = 


The horizon $r = r_+$ is regular in this coordinate system and is described by $\tilde{r} = \pm\tilde{t}$.
The transformation(s) are valid in the domain $r_- < r < +\infty$.


Per, weak as for Schwarzschild 
fs eg =F —» Lee particularly *# — +oo for t = 0 
ror. 1PaoP S00 particularly t + +oo for * = 0 


r=TE, :7? — 1? =const. > 0 hyperbolas in I, II patches. 


In contrast to the Schwarzschild case, the full upper and lower half-planes of the
$(\tilde{r}, \tilde{t})$ plane are covered. They are not limited by the hyperbolas of the Schwarzschild
singularity $r = 0$!


We can regularize with respect to r_ by introducing 
v u 

7) — _— I — —- 5 I A 

0 exp( ~), u ex (5) (164) 


7e 7. (165) 


Now we find 


This coordinate system covers the domain $-\infty < r < r_+$. Like the first coordinate
system, it also contains the region between the horizons, $r_- < r < r_+$. This time,
$r > r_+$ is excluded.


r=r :f-=Ht as above 
rf —00:f? —P > 0 particularly t > -too for * = 0 
ror, :7—P 5-00 particularly 7 — +oo for t = 0 


r=Trp_ :7° —t? =const.>0 hyperbola in I*, II* patches 


Fa 17=-P= <e <0O hyperbola in I*, II* patches. 
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Again, the whole (7,¢) plane is covered. Note that the spacetime extends beyond 
the ring(!) singularity. 


3.7. Penrose—Carter diagram and Cauchy horizon 


We compactify the Kruskal-like coordinate system for Kerr, yielding conformal 
Penrose—Carter diagrams. We discuss the analytical extension and the role of the 
inner horizon as Cauchy horizon. 

In order to draw a Penrose–Carter diagram for the Kerr spacetime, we compactify
the coordinates via the tangent function as in Sec. 2.6. The result looks at first
quite similar to Schwarzschild in Fig. 11. However, the cut-off at $r = 0$ vanishes.
Figures 13 and 14 both show the entire compactified $(\tilde{r}, \tilde{t})$-space.

The two coordinate sets overlap in the region between the horizons. Thus, the
corresponding coordinate patches have to be identified. And we can go even
beyond that: Patch II is identified with patch IV*, II* with another patch IV**, and
so on. We find an infinite sequence of coordinate systems. Formally, this constitutes
a maximal analytic extension of the Kerr spacetime. Alas, there are good reasons
for not believing in such a vast extension.

The Kerr metric is a vacuum solution of Einstein’s field equation — it describes 
a totally empty spacetime. To render it physically meaningful, we should regard it as 
the spacetime structure generated by a sensible physical source. One may ask then, 
why a single source should produce an infinite number of spacetimes. And it is even 
worse. The regions beyond the Cauchy horizon are exceptionally badly behaved. 


Fig. 13. The Penrose diagram for the Kerr spacetime for $r > r_-$.


Fig. 14. The Penrose diagram for the Kerr spacetime for $r < r_-$. (Labels in the figure: ring singularity $r = 0$.)


Consider the Cauchy surface in regions I+II of Fig. 15. All light rays and particle
trajectories from the past intersect this surface only once. Then the field equations
will tell us their future development, see Franzen, for example. In Fig. 15, this
is roughly indicated by the little light cone. However, even total knowledge of the
world in I+II does not determine what might be going on in regions I*+II*. That
is why $r = r_-$ is called a Cauchy horizon, see Fig. 16. Thus, I* and II* are not only
beyond the Cauchy horizon but also beyond predictable, sound physics. Moreover,
the zigzagged region beyond the singularity is physically doubtful. In this region, the
asymptotics is reversed, see the permutation of $\mathcal{I}^+$ and $\mathcal{I}^-$. As a consequence, the
asymptotic mass in I* picks up a minus sign as compared to I. So the same source
possesses a positive mass $+m$ in I and a negative mass $-m$ in I*, which seems
strange. Moreover, it turns out that these regions are crowded with closed timelike
curves. The whole extension is not globally hyperbolic. Thus one should restrict oneself to
the “diamond of sound physics”, I+II+III+IV. To do this consistently, one has to
devise a physical mechanism preventing traveling beyond the Cauchy horizon, that
is, the Cauchy horizon should become singular in some sense (cosmic censorship,
see Penrose).


3.8. Gravitoelectromagnetism, multipole moments 


The curvature tensor of the Kerr metric is calculated. By squaring it suitably, we 
find the two quadratic curvature invariants. Subsequently, we determine the gravi- 
toelectric and the gravitomagnetic multipole moments of the Kerr metric, and we 
mention the Simon–Mars tensor, the vanishing of which leads to the Kerr metric.


Fig. 15. Maximal analytic extension of the Kerr spacetime. (For color version, see page I-CP3.)


Fig. 16. Cauchy horizon: The causal past of a point P outside the Cauchy horizon of S is entirely
determined by the information given on the Cauchy surface S. A point Q inside the Cauchy
horizon also receives information from $\mathcal{I}^-$. Evidently, initial data on S are not sufficient to uniquely
determine events at point Q. The surface that separates the two regions “causally determined by
S” and “not causally determined by S” is called Cauchy horizon.


The analogy between gravity and electrostatics became apparent when the 
Coulomb law was discovered in 1785. The gravitational and the electrostatic forces 
both obeyed an inverse-square law, with the difference that the mass can only be 
positive whereas the electric charge exists with both signs. Equal electric charges 
repel, opposite ones attract; in contrast, gravity is always attractive. 

In 1820, electromagnetism was discovered by Oersted, and the emerging unified
theory, called “electrodynamics” by Ampère, eventually found its expression in the
Maxwell equations of 1864. Besides the electric field E related to charge, we have the
magnetic field B related to moving charge. These fields, together with the electric
and magnetic excitations D and H, respectively, obey the Maxwell equations.

Newton’s gravitational theory was only superseded in 1915/16 by Einstein’s
gravitational theory, general relativity. However, already in the 1870s physicists
began to speculate whether, besides Newton’s “gravitoelectric” field, related to
mass at rest, there may also exist a new “gravitomagnetic” field, accompanying
moving mass; for more details and references see Mashhoon. As we saw above,
these speculations acquired a solid basis in general relativity. In (90), the gravitomagnetic
Lense–Thirring term surfaced, which has found solid experimental verification in
the meantime. Thus, we can speak with justification of gravitoelectromagnetism
(GEM), a notion which can guide our intuition, see in this context also Ni and
Zimmermann.


3.8.1. Gravitoelectromagnetic field strength 


Electrodynamics is a linear theory, GR a nonlinear one. Still, if we take a linearized
version of GR, there are strong analogies between electrodynamics and
gravitodynamics, as worked out nicely, for instance, in Rindler’s book. However,
the analogies go even further, as pointed out particularly by Mashhoon. Even in
an arbitrary gravitational field, if referred to a Fermi propagated reference frame
with coordinates $(T, \mathbf{X})$, GEM is a useful concept. If we apply the geodesic deviation
equation (33) to such a frame, the gravitoelectromagnetic field strength, representing
the tidal forces, turns out to beᵗ

$${}^{\rm GEM}F_{\alpha\beta} = -R_{\alpha\beta 0i}(T)\,X^i. \tag{166}$$
If we develop (33) up to the order linear in the velocity $\mathbf{V} := d\mathbf{X}/dT$, we find

$$\frac{d^2X^i}{dT^2} = -R_{0i0j}\,X^j + 2R_{ik0j}\,V^kX^j + O(V^2). \tag{167}$$

Now we recall that in electrodynamics the electric and the magnetic fields E and
B, respectively, are accommodated in the 4D electromagnetic field strength tensor
according to

$$(F_{\alpha\beta}) = \begin{pmatrix} 0 & -E_1 & -E_2 & -E_3 \\ \diamond & 0 & B_3 & -B_2 \\ \diamond & \diamond & 0 & B_1 \\ \diamond & \diamond & \diamond & 0 \end{pmatrix} = -(F_{\beta\alpha}). \tag{168}$$

The diamond symbol $\diamond$ denotes matrix elements already known because of the
antisymmetry of the matrix involved. The corresponding two-form reads $F =
\frac{1}{2}F_{\alpha\beta}\,dx^\alpha\wedge dx^\beta$. Keeping (168) in mind, Eq. (167) can be rewritten as a vector
equation

$$\frac{d^2\mathbf{X}}{dT^2} = -{}^{\rm GEM}\mathbf{E} - 2\,\mathbf{V}\times{}^{\rm GEM}\mathbf{B}. \tag{169}$$

In accordance with the equivalence principle, this equation of motion is independent
of the mass. The analogy with electromagnetism requires that the gravitoelectric
charge, in terms of the mass $m$, is $-1$ and the gravitomagnetic charge $-2$. In
electrodynamics, both quantities are $+1$. The difference comes from the vector nature
of the electromagnetic potential $A_\alpha$, as compared to the tensor nature of the
gravitational potential $g_{\alpha\beta}$, that is, helicity 1 as compared to helicity 2. The ratio
of the gravitomagnetic to the gravitoelectric charge, that is, the gyrogravitomagnetic
ratio, is two. Note that in Gravity Probe-B the authors specify
the gyrogravitomagnetic ratio as 1. However, their gyros carried only orbital angular
momentum rather than spin angular momentum. Hence, this is to be expected.
For more detailed discussions on this difference, see Refs. 80 and 138.


ᵗ Alternatively, we could generalize the Newtonian tidal force matrix of (9) to the gravitoelectric
and gravitomagnetic tidal force matrices, $\mathcal{E}_{ij} = R_{i0j0}$ and $\mathcal{B}_{ij} = \epsilon_{ikl}R_{klj0}$, respectively, see
Scheel and Thorne. Both matrices are symmetric and trace-free. Note that ${}^{\rm GEM}F_{\alpha\beta}$ is an
antisymmetric $4\times4$ matrix and $\mathcal{E}$ and $\mathcal{B}$ are both symmetric trace-free $3\times3$ matrices.
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It has been pointed out by Ni!* that the “measurement of the gyrogravitational 
ratio of [a] particle would be a further step!88 toward probing the microscopic origin 
of gravity. GP-B serves as a starting point for the measurement of the gyrogravi- 
tational factor of particles.” 


3.8.2. Quadratic invariants 


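In electrodynamics, we have two quadratic invariants:

$$\frac{1}{2}F_{\alpha\beta}F^{\alpha\beta} = {}^\star({}^\star F\wedge F) = B^2 - E^2, \qquad \frac{1}{2}F_{\alpha\beta}F^{\star\alpha\beta} = {}^\star(F\wedge F) = 2\,\mathbf{E}\cdot\mathbf{B}, \tag{170}$$

where we used for the tensor dual the notation $F^{\star\alpha\beta} := \frac{1}{2}\epsilon^{\alpha\beta\gamma\delta}F_{\gamma\delta}$. We also
employed the very concise notation of exterior calculus with the Hodge star
operator.ᵘ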

The first invariant is proportional to the Maxwell vacuum Lagrangian and is 
an ordinary scalar, whereas the second one corresponds to a surface term and is a 
pseudoscalar (negative parity). 

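Turn now directly to the Kerr metric and list for this example the tidal
gravitational forces, which are represented by the curvature tensor. With its 20
independent components, it can be represented by a trace-free symmetric $6\times6$
matrix, see (32). The collective indices $A, B, \ldots = 1, \ldots, 6$ are defined as follows:
$\{\hat{t}\hat{r}, \hat{t}\hat\theta, \hat{t}\hat\phi; \hat\theta\hat\phi, \hat\phi\hat{r}, \hat{r}\hat\theta\} \to \{1, 2, 3, 4, 5, 6\}$. We throw the orthonormal Kerr coframe
(128) to (133) into our computer and out pops the $6\times6$ curvature matrix

$$(R_{AB}) = \begin{pmatrix}
-2\mathbb{E} & 0 & 0 & 2\mathbb{B} & 0 & 0 \\
\diamond & \mathbb{E} & 0 & 0 & -\mathbb{B} & 0 \\
\diamond & \diamond & \mathbb{E} & 0 & 0 & -\mathbb{B} \\
\diamond & \diamond & \diamond & 2\mathbb{E} & 0 & 0 \\
\diamond & \diamond & \diamond & \diamond & -\mathbb{E} & 0 \\
\diamond & \diamond & \diamond & \diamond & \diamond & -\mathbb{E}
\end{pmatrix} = (R_{BA}), \tag{171}$$

with

$$\mathbb{E} := mr\,\frac{r^2 - 3a^2\cos^2\theta}{(r^2 + a^2\cos^2\theta)^3}, \qquad \mathbb{B} := ma\cos\theta\,\frac{3r^2 - a^2\cos^2\theta}{(r^2 + a^2\cos^2\theta)^3}. \tag{172}$$

It is straightforward to identify $\mathbb{E}$ as the gravitoelectric and $\mathbb{B}$ as the gravitomagnetic
component of the curvature. This is in accordance with (166).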


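ᵘ The Hodge star ${}^\star\omega$ of a $p$-form $\omega = (1/p!)\,\omega_{\mu_1\cdots\mu_p}\,dx^{\mu_1}\wedge\cdots\wedge dx^{\mu_p}$ is an $(n-p)$-form ${}^\star\omega$, with the
components $({}^\star\omega)_{\mu_1\cdots\mu_{n-p}} = (1/p!)\,\epsilon^{\nu_1\cdots\nu_p}{}_{\mu_1\cdots\mu_{n-p}}\,\omega_{\nu_1\cdots\nu_p}$, where $\epsilon$ is the totally antisymmetric
unit tensor and $n$ the dimension of the space, see Eq. (C.2.90) in Ref. 81.

It is obvious how we should continue. Our gravitoelectromagnetic invariants
will beᵛ

$$K := \frac{1}{2}R_{\alpha\beta\gamma\delta}R^{\alpha\beta\gamma\delta} = -{}^\star(R_{\alpha\beta}\wedge{}^\star R^{\alpha\beta}), \tag{173}$$

$$P := \frac{1}{4}\epsilon^{\alpha\beta\mu\nu}R_{\alpha\beta\gamma\delta}R_{\mu\nu}{}^{\gamma\delta} = {}^\star(R_{\alpha\beta}\wedge R^{\alpha\beta}). \tag{174}$$

Again, our program determines the Kretschmannʷ scalar $K$ and the Chern–Pontryagin
pseudoscalar $P$ to be

$$K = -24(\mathbb{B}^2 - \mathbb{E}^2), \qquad P = -48\,\mathbb{E}\,\mathbb{B}. \tag{175}$$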

The similarity to (170) is impressive. The GEM analogy quite apparently applies 
to the full nonlinear theory. The results in (175), partly in more involved represen- 
tations, can be found in the literature, see, for instance, the books of de Felice and 
Clarke?” and of Ciufolini and Wheeler,*° but compare also de Felice and Bradley,*! 
Henry,®4 and Cherubini et al.°? 

Thus, the quadratic invariants K and P confirm that the Kerr metric is the exte- 
rior field of a rotating mass distribution. In order to get more information about 
this distribution, we proceed, like in electrodynamics, and look into the gravitoelec- 
tromagnetic multipole moments of this rotating mass. 
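
The invariants (175) are easy to evaluate from the curvature components (172). The sketch below uses illustrative function names and $G = c = 1$, with `E` and `B` denoting the gravitoelectric and gravitomagnetic components of (172). Its usage example compares $K$ with the expanded polynomial form that follows algebraically from (172).

```python
import math


def gem_curvature(m, a, r, theta):
    """Gravitoelectric E and gravitomagnetic B curvature components, Eq. (172)."""
    c = a * math.cos(theta)
    rho6 = (r * r + c * c) ** 3
    E = m * r * (r * r - 3.0 * c * c) / rho6
    B = m * c * (3.0 * r * r - c * c) / rho6
    return E, B


def quadratic_invariants(m, a, r, theta):
    """K = -24 (B^2 - E^2) and P = -48 E B as given in Eq. (175)."""
    E, B = gem_curvature(m, a, r, theta)
    return -24.0 * (B * B - E * E), -48.0 * E * B


if __name__ == "__main__":
    m, a, r, theta = 1.0, 0.6, 2.5, 1.1
    K, P = quadratic_invariants(m, a, r, theta)
    c = a * math.cos(theta)
    # Expanded form of E^2 - B^2, obtained by multiplying out (172).
    poly = r ** 6 - 15 * r ** 4 * c ** 2 + 15 * r ** 2 * c ** 4 - c ** 6
    print(K, 24.0 * m * m * poly / (r * r + c * c) ** 6, P)
```

For $a = 0$ the pseudoscalar $P$ vanishes and $K$ reduces to $24m^2/r^6$, half of the value obtained with the more common definition of the Kretschmann scalar without the factor $1/2$.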


3.8.3. Gravitomagnetic clock effect of Mashhoon, Cohen et al. 


According to the results of Lense–Thirring, the rotation of the Sun changes the
spacetime around it by inducing gravitomagnetic effects. As we saw above, in a
similar way the temporal structure around a Kerr metric is affected by the angular
momentum of the Kerr source. Thus, a gravitomagnetic clock effect should emerge,ˣ
the measurability of which requires very accurate clocks. The effect can be
demonstrated by two clocks that move on equatorial orbits, one in prograde and the other
in retrograde orbit around the Kerr metric. It turns out that the prograde
equatorial clock is slower than the retrograde one. This is not necessarily what our
intuition would tell us. It is connected with the fact that the dragging of frames in
a Kerr metric can sometimes turn out to be an “antidragging”, thus making this
notion less intuitive, as we already recognized in Sec. 3.5.


‘In exterior calculus, we have the Euler four-form E := Rag A *R°?, with K = +E. Analogously, 
we have the Chern—Pontryagin four-form P := —R® aNRP a, Which is an exact form, with P := xP, 
see e.g. Obukhov et al.!44 

ʷ Usually in the literature, the Kretschmann scalar is defined as $R_{\alpha\beta\gamma\delta}R^{\alpha\beta\gamma\delta}$, even though
the electrodynamics analogy would suggest to include the factor 1/2.

ˣ This was first predicted by Cohen and Mashhoon and worked out in greater detail by Mashhoon
et al., see also Bonnor and Steadman and the review papers in the workshop of
Lämmerzahl et al. In a similar way, there emerges also a gravitomagnetic time delay, see
Ciufolini et al.



Generalizations of this clock effect were studied, for example, by Hackmann and
Lämmerzahl. The recent discussion of the clocks around Sgr A* by Angélil and
Saha is, in effect, just one more manifestation of the gravitomagnetic clock effect.


3.8.4. Multipole moments: Gravitoelectric and gravitomagnetic ones 


In Newton’s theory, one gets a good idea about a mass distribution and its gravitational
field by determining the multipole moments of the mass distribution $M$. In
GR, because of the existence of gravitomagnetism, we have to expect a new type of
multipole moments, namely the moments $J$ of the angular momentum distribution.

If a stationary axially symmetric line element of the form (93) is asymptotically
flat, then it is possible to define two sets of multipole moments, the gravitoelectric
moments $M_s$ (“mass multipole moments”) and the gravitomagnetic moments $J_s$
(“angular momentum multipole moments”), for $s = 0, 1, 2, \ldots$. These moments
were found by Geroch for the static and by Hansen for the stationary case.
They were reviewed by Quevedo and used for constructing new exact solutions
by Quevedo and Mashhoon. Hansen computed the multipole moments for the
Kerr solution and found

$s = 0$:    $M_0 = -m$,       $J_1 = ma$,        (176)
$s = 1$:    $M_2 = ma^2$,     $J_3 = -ma^3$,     (177)
$s = 2$:    $M_4 = -ma^4$,    $J_5 = ma^5$,      (178)
$s = 3$:    $M_6 = ma^6$,     $J_7 = -ma^7$, ... (179)

More compactly, we have

$$M_{2s} = (-1)^{s+1}ma^{2s}, \qquad M_{2s+1} = 0, \tag{180}$$

$$J_{2s} = 0, \qquad J_{2s+1} = (-1)^s ma^{2s+1}. \tag{181}$$

It is possible to introduce normalized multipole moments, see Meinel et al.,
such that for Kerr we have $M_s + iJ_s = m(ia)^s$. Then the mass monopole $M_0 =
m$ is positive. Apparently, the Kerr metric has a simple multipolar structure or,
formulated differently, only very specific matter distributions can represent the
interior of the Kerr metric.
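
The two conventions for the Kerr multipole moments can be compared directly. The following sketch tabulates the moments according to (180) and (181) and the normalized complex moments $m(ia)^s$; the function names are illustrative and not from the text. Under these formulas the normalized mass moments differ from the ones in (180) by an overall sign, while the angular momentum moments agree.

```python
def hansen_moments(m, a, s_max):
    """Moments of the Kerr solution as listed in Eqs. (180), (181).

    Returns two lists M and J indexed by the multipole order 0 .. 2*s_max + 1.
    """
    order = 2 * s_max + 1
    M = [0.0] * (order + 1)
    J = [0.0] * (order + 1)
    for s in range(s_max + 1):
        M[2 * s] = (-1) ** (s + 1) * m * a ** (2 * s)
        J[2 * s + 1] = (-1) ** s * m * a ** (2 * s + 1)
    return M, J


def normalized_moments(m, a, order_max):
    """Normalized moments M_s + i J_s = m (i a)^s as complex numbers."""
    return [m * (1j * a) ** s for s in range(order_max + 1)]


if __name__ == "__main__":
    M, J = hansen_moments(2.0, 0.5, 3)
    z = normalized_moments(2.0, 0.5, 7)
    for s in range(8):
        print(s, M[s], J[s], z[s])
```

The real part of `z[s]` equals `-M[s]` and the imaginary part equals `J[s]`, so that the normalized mass monopole is $+m$.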

Quevedo!° compiled a number of theorems which illustrate the use of the mul- 
tipole moments: 


(i) A stationary spacetime is static if and only if all its gravitomagnetic multipole 
moments vanish (Xanthopoulos, 1979). 
(ii) A static metric is flat if and only if all its gravitoelectric multipole moments 
vanish (Xanthopoulos, 1979). 
(iii) A stationary metric is axisymmetric if and only if all its multipole moments 
are axisymmetric (Giirsel, 1983). 
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(iv) Two metrics with the same multipole moments have the same geometry at 
large distances from the source (Beig and Simon, 1981; Kundu, 1981; Van den 
Bergh and Wils, 1985). 

(v) Any stationary, axisymmetric, asymptotically flat solution of Einstein’s vac- 
uum equation approaches the Kerr solution asymptotically (Beig and Simon, 
1980). 

(vi) Any static, axisymmetric, asymptotically flat vacuum solution approaches the 
Schwarzschild solution asymptotically (Beig, 1980). 

In the formulation of Stephani et al.:

(vii) A given asymptotically flat stationary vacuum spacetime is uniquely
characterized by its multipole moments.


We recognize that the knowledge of the multipole moments provides a lot of insight 
into the physical properties of an exact solution. 

From the point of view of the Kerr solution, Theorem (v), see Beig and Simon,
is perhaps the most interesting one. It underlines the central importance of the
Kerr solution. The considerations in the context of Theorem (v) were further
developed by Simon. On the three-dimensional spatial slices of a stationary axially
symmetric metric, he defined the 3D “Simon tensor”, a kind of complexified
generalized Cotton–Bach tensor. The vanishing of the Simon tensor then leads to the
multipole moments of the Kerr solution. Later, Mars (see also the further work by Mars
and by Mars and Senovilla) generalized this approach and was led to the 4D “Simon–Mars
tensor”. In Ionescu and Klainerman, one can find a more extended discussion of
the Simon–Mars tensor, see also Wong. More recently, Bäckdahl and Valiente
Kroon have proposed replacing the Simon–Mars tensor by another measure of
“non-Kerrness”, namely a scalar parameter.


3.9. Adding electric charge and the cosmological constant: 
Kerr—Newman metric 


Enriching the Kerr metric by an electric charge is straightforwardly possible. We
start from the metric (133) with coframe (128) to (132). This coframe can
accommodate the Kerr, the Schwarzschild, and the Reissner–Nordström solutions. The
different forms of the function $\Delta$ suggest what a charged Kerr solution should look
like:


Schwarzschild ($m$):              $\rho^2 = r^2$,                        $\Delta = r^2 - 2mr$
Reissner–Nordström ($m, q$):      $\rho^2 = r^2$,                        $\Delta = r^2 - 2mr + q^2$
Kerr ($m, a$):                    $\rho^2 = r^2 + a^2\cos^2\theta$,      $\Delta = r^2 - 2mr + a^2$
Kerr–Newman ($m, a, q$):          $\rho^2 = r^2 + a^2\cos^2\theta$,      $\Delta = r^2 - 2mr + a^2 + q^2$


Charging the Schwarzschild solution is achieved by adding $q^2$ to the function $\Delta$.
Since the charged Kerr solution should encompass the Reissner–Nordström solution,
we tentatively keep the term $q^2$ for the case $a \neq 0$. Now, we can indeed find a
potential


$$A = \frac{q r}{\rho^2}\left(dt - a\sin^2\theta\, d\phi\right), \tag{182}$$


such that the Einstein-Maxwell equations are fulfilled. The potential describes a
line-like charge distribution at $\rho = 0$, that is, on the ring singularity of the Kerr
spacetime, which is quite satisfying.'°° This charged Kerr solution was first worked
out by Newman et al.'°* (1965), using “methods which transcend logic”, as Ernst®*
puts it. He, in turn, proceeded from (120). Replacing^y $\xi$ by $\sqrt{1 - qq^*}\,\xi$ generates a
solution of the Einstein-Maxwell equations with potential $A_t + iA_\phi = q/(\xi + 1)$.
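
To make the potential (182) concrete, the following short Python sketch evaluates its two nonvanishing components in Boyer-Lindquist coordinates. It assumes the reading $A_t = qr/\rho^2$ and $A_\phi = -qra\sin^2\theta/\rho^2$ with $\rho^2 = r^2 + a^2\cos^2\theta$; the function name is ours and purely illustrative. For $a = 0$ the result should reduce to the Coulomb potential $q/r$ of the Reissner-Nordström solution.

```python
import math


def kerr_newman_potential(r, theta, a, q):
    """Return (A_t, A_phi) of A = (q r / rho^2) (dt - a sin^2(theta) dphi).

    Assumption: rho^2 = r^2 + a^2 cos^2(theta), Boyer-Lindquist coordinates.
    """
    rho2 = r**2 + (a * math.cos(theta)) ** 2
    a_t = q * r / rho2
    a_phi = -q * r * a * math.sin(theta) ** 2 / rho2
    return a_t, a_phi
```

For instance, `kerr_newman_potential(2.0, 0.7, 0.0, 0.5)` yields $A_t = 0.25 = q/r$ and a vanishing $A_\phi$, as expected for the nonrotating case.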

The Kerr and the Kerr-Newman solutions behave quite similarly. We can adopt
most of the discussion of the Kerr metric by substituting $a^2 + q^2$ for $a^2$.
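
The substitution rule can be illustrated with the horizon radii. Assuming, as for the Kerr metric, that the horizons sit at the zeros of $\Delta = r^2 - 2mr + a^2 + q^2$, one finds $r_\pm = m \pm \sqrt{m^2 - a^2 - q^2}$. The helper below is a minimal sketch of this formula (the function name is hypothetical, not taken from the text):

```python
import math


def horizon_radii(m, a=0.0, q=0.0):
    """Zeros (r_plus, r_minus) of Delta = r^2 - 2 m r + a^2 + q^2.

    Returns None when m^2 < a^2 + q^2, i.e. when Delta has no real zero
    and therefore no horizon exists.
    """
    disc = m**2 - a**2 - q**2
    if disc < 0:
        return None
    root = math.sqrt(disc)
    return m + root, m - root
```

Only the combination $a^2 + q^2$ enters, so `horizon_radii(1.0, 0.6, 0.0)` and `horizon_radii(1.0, 0.0, 0.6)` coincide, and `horizon_radii(1.0)` reproduces the Schwarzschild radius $2m$.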

We can further generalize the Kerr-Newman metric to include also a cosmolog- 
ical constant, see Sec. 4.1, and even more parameters, see Fig. 17. 

Fig. 17. Schematics of Petrov D solutions: The spherically symmetric Schwarzschild solution
with mass parameter $m$ is located in the center. Adding an electric charge $q$ brings us to the
Reissner-Nordström solution. It is still spherically symmetric, but adds a second horizon. The
distance between the horizons increases with the charge $q$. Setting the black hole into rotation,
the angular momentum parameter emerges, $a \neq 0$, and reduces the spherical symmetry to an
axial one. An oblate ergosurface (two, actually) forms. Event horizon and ergosurface meet at the
polar axis, the equatorial distance increases with $a$. All these solutions can be deSittered, that is, a
cosmological constant $\Lambda$ is added. All presented solutions are subcases of the Plebański-Demiański
solution, which adds three more parameters.°?


^y Here, $q$ is not the charge but a complex parameter in the solution of the Ernst equation.

3.10. On the uniqueness of the Kerr black hole


The Kerr black hole, up to some technical assumptions, is the unique solution for 
the stationary, axially symmetric case. We point to some of the literature where 
these results can be found. 

Because of the Birkhoff theorem, the Schwarzschild solution (mass parameter $m$)
represents the general spherically symmetric solution of the Einstein vacuum field
equation. The analogous statement holds in the Einstein-Maxwell case for the
two-parameter Reissner-Nordström solution (mass and charge parameters $m$ and $q$,
respectively). Thus, for spherical symmetry, we have a fairly simple situation.

In contrast, in the axially symmetric case, there does not exist a generalized 
Birkhoff theorem. The two-parameter Kerr solution (mass and rotation parameters 
m and a, respectively), is just a particular solution for the axially symmetric case. 
As we saw in Sec. 3.8, the Kerr solution has very simple gravitoelectric and grav- 
itomagnetic multipole moments (180, 181). Numerous solutions are known that 
represent the exterior of matter distributions with different multipole moments. 
The same holds for the three-parameter Kerr-Newman solution (parameters
$m$, $a$, $q$), see Stephani et al.8° and Griffiths and Podolsky.®
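
For orientation, the Kerr multipole moments can be generated in a few lines. The sketch below assumes the standard Geroch-Hansen form $M_n + iJ_n = m\,(ia)^n$, which we take to be the content of (180, 181); the function name is illustrative only.

```python
def kerr_multipoles(m, a, n_max):
    """List of (M_n, J_n) for n = 0..n_max, assuming M_n + i J_n = m (i a)^n.

    M_n are the gravitoelectric (mass) moments and J_n the
    gravitomagnetic (angular momentum) moments.
    """
    moments = []
    z = complex(m, 0.0)
    for _ in range(n_max + 1):
        moments.append((z.real, z.imag))
        z *= 1j * a  # raise the order by one: multiply by (i a)
    return moments
```

Under this assumption the mass moments are nonzero only for even $n$ and the angular momentum moments only for odd $n$, and all of them are fixed by the two parameters $m$ and $a$. A solution with independently chosen moments is therefore not Kerr.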

However, one can show under quite general conditions that the Kerr-Newman 
metric represents the most general asymptotically flat, stationary electro-vacuum 
black hole solution (“no-hair theorem”), see Meinel’s short review.!2? Important 
contributions to the subject of black hole uniqueness were originally made by 
Israel,°°-9! Carter,?°3! Hawking and Ellis,”® Robinson,!®°1®? and Mazur!?° (1967— 
1982), for details see the recent review of Chruściel et al.?®

More recently Neugebauer and Meinel!®°:!53 found a constructive method for 
proving the uniqueness theorem for the Kerr black hole metric. This was extended to 
the Kerr-Newman case by Meinel.!?! By inverse scattering techniques, they showed 
how one can construct the Ernst potential of the Kerr(-Newman) solution amongst 
the asymptotically flat, stationary, and axially symmetric (electro—)vacuum space- 
times surrounding a connected Killing horizon. 

Let us then eventually pose the following questions?”: 

(i) Are axially symmetric, stationary vacuum solutions outside some matter dis- 
tribution “Kerr”? The answer is “certainly not”, and it makes sense to figure 
out ways to characterize the Kerr metric, see Sec. 3.8. 

(ii) Is the Kerr solution the unique axially symmetric, stationary vacuum black
hole? The answer is essentially “yes” (modulo some technical issues); see, for
example, Mazur.'?°


The general tendency in the recent development of the subject is to use
additional scalar^z or other matter fields. They weaken the uniqueness theorems, which
is probably not too surprising.


^z Recently, Herdeiro and Radu took as source for the Einstein field equation a massive complex
scalar field (without self-interaction). They found numerically a generalization of the Kerr solution,
which may be of some relevance to astrophysics.!93:194

Let us conclude with a quotation that may make you curious to learn still more
about the beauty of the Kerr metric: We have many different axially symmetric
solutions. The Kerr solution is characterized by “stationary, axially symmetric,
asymptotically flat, Petrov type D vacuum solution of the vanishing of the Simon
tensor, admitting a rank-2 Killing-Stäckel (KS) tensor of Segre type [(11)(11)]
constructed from a (nondegenerate) rank-2 Killing-Yano (KY) tensor”, see Hinoui
et al.


3.11. On interior solutions with material sources 


To match the Kerr (vacuum) metric to a material source consistently is one of the 
big unsolved problems. Only the rotating disc solution of Neugebauer and Meinel 
provides some hope. 

This section is added in order to draw your attention to an unsolved problem, to
the solution of which you might want to contribute: Find a realistic material source
for the Kerr metric in the sense of an exact solution. Many unsuccessful attempts
have been made, see the early review of Krasiński!®! of 1978. More recently, in
2006, Krasiński!®® concluded “that a bright new idea is needed, as opposed to
routine standard tricks tested so far.” This statement was not made lightheartedly;
Krasiński knows what he is talking about.

Many axially symmetric vacuum solutions were constructed. Quevedo and 
Mashhoon,'®! for example, deformed the multipole moments of the Kerr(-Newman) 
metric and constructed appropriate solutions of the Einstein(-Maxwell) equation 
that describe the exterior gravitational field of a (charged) rotating mass. It is 
always the hope that somebody may find a suitable matter distribution with the 
multipole moments of the Kerr solution, but this has not happened so far. For
another approach see Marsh.!!® 

We are only aware of one exact solution that fits into this general context:
It is the infinitesimally thin and rigidly rotating dust solution of Neugebauer and
Meinel!3!;12 (1993). It is an exact analytical solution of the Einstein equation with
matter. It depends on two independent parameters, the radius $\rho_0$ of the disk and its
angular velocity $\Omega$. Petroff and Meinel!*! developed, by means of an iterative
procedure, a post-Newtonian approximation of the solution that helps to understand
the Newtonian limit.

We recall that in electrostatics in flat space, for example, we prescribe an
electric charge distribution and are used to solving the corresponding boundary value
problem within Maxwell’s theory. Similarly, Neugebauer and Meinel specified a very
thin rotating disk of dust and solved the boundary value problem within GR. This is
a well-defined procedure. The problem is, however, that within a nonlinear theory,
such as GR, it is extremely hard to implement. Remarkably, for certain parameter
values, the gravitational field of the disk approaches the extremal Kerr case.
Accordingly, there exists a certain relation to the Kerr problem. The desideratum would
be to find a rotating matter distribution the external field of which coincides with
the complete Kerr field.

Driven by the fact that the electrically charged Kerr solution, the Kerr-Newman
solution, has a g-factor of 2, exactly like the electron (see also Pfister and King!?),
Burinskii24,25 speculated that a soliton-like solution of the Dirac equation may be
the source of the Kerr metric, see also Burinskii and Kerr.?° Is that the “bright
new idea” Krasiński was talking about? We do not know, but a hard check of the
Burinskii ansatz seems worthwhile.


4. Kerr Beyond Einstein 


In generalizations of Einstein's theory of gravity, the Riemannian geometry of space- 
time is often extended to a more general geometrical framework. We describe two 
such examples in which the Kerr metric still plays a vital role. 


4.1. Kerr metric accompanied by a propagating linear connection 


We display the Kerr metric with cosmological constant that, together with an explic- 
itly specified torsion, represents an exact vacuum solution of the two field equations 
of the Poincaré gauge theory of gravity with quadratic Lagrangian. 

In gauge theories of gravitation, see Blagojević et al.,!” the linear connection
becomes a field that is at least partially independent from the metric. It can be 
either metric-compatible, then it is a connection with values in the Lie-algebra of 
the Lorentz group SO(1,3) and the geometry is called a Riemann—Cartan geometry, 
or it can be totally independent, then it resides in a so-called metric-affine space 
and the connection is GL(4, R)-valued. For simplicity, we concentrate here on the 
former case, the Poincaré gauge theory of gravity, but the latter case is also treated 
in the literature.11187 

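Let us briefly sketch the theory. Gauging the Poincaré group leads to a
spacetime with torsion $T^\alpha$ and curvature $R^{\alpha\beta}$ (Riemann-Cartan geometry^aa):

$$T^\alpha := D\vartheta^\alpha = d\vartheta^\alpha + \Gamma_\beta{}^\alpha \wedge \vartheta^\beta = \frac{1}{2}\, T_{ij}{}^\alpha\, dx^i \wedge dx^j, \tag{183}$$

$$R^{\alpha\beta} := d\Gamma^{\alpha\beta} - \Gamma^{\alpha\gamma} \wedge \Gamma_\gamma{}^\beta = -R^{\beta\alpha} = \frac{1}{2}\, R_{ij}{}^{\alpha\beta}\, dx^i \wedge dx^j. \tag{184}$$

Besides the coframe one-form $\vartheta^\alpha$, the Lorentz connection one-form
$\Gamma^{\alpha\beta} = \Gamma_i{}^{\alpha\beta}\, dx^i = -\Gamma^{\beta\alpha}$ is a second field variable of the gauge theory. For a Riemannian
space, the torsion vanishes, $T^\alpha = 0$, and $\Gamma^{\alpha\beta}$ becomes the Levi-Civita connection.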

We choose a model Lagrangian quadratic in torsion and curvature, in actual
fact (for $\hbar = 1$, $c = 1$),

$$V = \frac{1}{2\kappa}\,(T^\alpha \wedge \vartheta^\beta) \wedge {}^\star(T_\beta \wedge \vartheta_\alpha) - \frac{1}{2\varrho}\, R^{\alpha\beta} \wedge {}^\star R_{\alpha\beta}, \tag{185}$$

with Einstein’s gravitational constant $\kappa$ (dimension length-squared) and a
dimensionless strong gravity coupling constant $\varrho$. One can calculate the two vacuum field


^aa Experimental limits of a possible torsion of spacetime were recently specified in a remarkable
paper by Obukhov et al.,!4? see also the literature given there.
equations by varying with respect to $\vartheta^\alpha$ and $\Gamma^{\alpha\beta}$. In 1988, for these two field
equations, a Kerr metric with torsion!? was found as an exact solution.

We display here the orthonormal coframe and the torsion: The coframe $\vartheta^{\hat\alpha}$, in
terms of Boyer-Lindquist coordinates $(t, r, \theta, \phi)$, reads (in the conventions used in
Ref. 10)


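$$\vartheta^{\hat 0} := \frac{\sqrt{\Delta}}{\rho}\left(dt + a\sin^2\theta\, d\phi\right), \tag{186}$$

$$\vartheta^{\hat 1} := \frac{\rho}{\sqrt{\Delta}}\, dr, \tag{187}$$

$$\vartheta^{\hat 2} := \frac{\rho}{\sqrt{F}}\, d\theta, \tag{188}$$

$$\vartheta^{\hat 3} := \frac{\sqrt{F}\,\sin\theta}{\rho}\left[a\, dt + (r^2 + a^2)\, d\phi\right]. \tag{189}$$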


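As before, we have $\rho^2 := r^2 + a^2\cos^2\theta$. However, the other structure functions pick
up a cosmological constant $\lambda$:

$$F := 1 + \frac{1}{3}\,\lambda a^2\cos^2\theta, \qquad \Delta := r^2 + a^2 - 2Mr - \frac{1}{3}\,\lambda r^2\,(r^2 + a^2). \tag{190}$$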


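The corresponding metric is called a Kerr-de Sitter metric. Since the coframe is
orthonormal, the metric reads

$$g = -\vartheta^{\hat 0} \otimes \vartheta^{\hat 0} + \vartheta^{\hat 1} \otimes \vartheta^{\hat 1} + \vartheta^{\hat 2} \otimes \vartheta^{\hat 2} + \vartheta^{\hat 3} \otimes \vartheta^{\hat 3}. \tag{191}$$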
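
As a cross-check of the coframe and the metric, the coordinate components $g_{ij}$ can be assembled numerically. The sketch below assumes the reading $\vartheta^{\hat 0} = (\sqrt{\Delta}/\rho)(dt + a\sin^2\theta\, d\phi)$, $\vartheta^{\hat 1} = (\rho/\sqrt{\Delta})\, dr$, $\vartheta^{\hat 2} = (\rho/\sqrt{F})\, d\theta$, $\vartheta^{\hat 3} = (\sqrt{F}\sin\theta/\rho)[a\, dt + (r^2 + a^2)\, d\phi]$, with $F = 1 + \lambda a^2\cos^2\theta/3$ and $\Delta = r^2 + a^2 - 2Mr - \lambda r^2(r^2 + a^2)/3$, and the signature $(-,+,+,+)$. The function name is ours, and the code is only valid where $\Delta > 0$ and $F > 0$.

```python
import math


def kerr_de_sitter_metric(r, theta, M, a, lam):
    """Components g[i][j] in coordinates (t, r, theta, phi).

    Built as g_ij = eta_ab e^a_i e^b_j from the orthonormal coframe,
    with eta = diag(-1, 1, 1, 1). Requires Delta > 0 and F > 0.
    """
    rho = math.sqrt(r**2 + (a * math.cos(theta)) ** 2)
    F = 1.0 + lam * a**2 * math.cos(theta) ** 2 / 3.0
    Delta = r**2 + a**2 - 2.0 * M * r - lam * r**2 * (r**2 + a**2) / 3.0
    s = math.sin(theta)
    sd, sf = math.sqrt(Delta), math.sqrt(F)
    # e[b][i]: component of coframe one-form number b along dx^i
    e = [
        [sd / rho, 0.0, 0.0, sd / rho * a * s**2],
        [0.0, rho / sd, 0.0, 0.0],
        [0.0, 0.0, rho / sf, 0.0],
        [sf * s / rho * a, 0.0, 0.0, sf * s / rho * (r**2 + a**2)],
    ]
    eta = [-1.0, 1.0, 1.0, 1.0]
    return [
        [sum(eta[b] * e[b][i] * e[b][j] for b in range(4)) for j in range(4)]
        for i in range(4)
    ]
```

Two limits are easy to inspect: for $M = a = \lambda = 0$ one recovers flat space in spherical coordinates, and for $a = \lambda = 0$ the Schwarzschild values $g_{tt} = -(1 - 2M/r)$ and $g_{rr} = (1 - 2M/r)^{-1}$.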


It is a characteristic feature of these exact solutions that even though the Lagrangian
(185) does not carry a cosmological constant, in the coframe and the metric there
emerges such a constant, namely $\lambda := -3\varrho/(4\kappa)$. This could be of potential
importance for cosmology.

The torsion $T^\alpha$ of this stationary axially symmetric solution of the Poincaré
gauge theory reads ($\vartheta^{\alpha\beta} := \vartheta^\alpha \wedge \vartheta^\beta$)


po a v9 4 Fa lva(o™ — 912) +. 9g (9%3 — 998)] — 20403], 


pi = 7, 
oe oe (192) 
T? = — lus 902 giz Ly 993 gis ; 
A p Shae. an aA An 
T3 = v 992 gi2 Ly 993 gis : 
Ta! 4( ) 5 ( I 
with the following gravitoelectric and gravitomagnetic functions: 
M Mr? 
v1, = —(r? —a’cos* 9), U5 = = ; (193) 
p p 
— Mea?rsin UO ee i eA oy ne Marcos 6 (194) 


p> p> ps 

Metric and torsion of this exact solution are closely interwoven. Note, in particular,
that the leading gravitoelectric part in the torsion, for small $a$, is $\sim M/r^2$, a definitive
Coulombic behavior proportional to the mass. For $a = 0$, we find a Schwarzschild-de
Sitter solution with torsion.

One may legitimately ask why the Lagrangian (185) yields an exact
solution with a Kerr-de Sitter metric. The answer is simple: The Lagrangian was
devised such that the torsion-square piece, in lowest order in $\kappa$, encompasses a
Newtonian approximation. This is already sufficient to enable the existence
of a Kerr-de Sitter metric. One could even add another torsion-square piece to $V$
to obtain an Einsteinian approximation, but this is not even necessary. Thus, only
a Newtonian limit of some kind seems necessary for the emergence of the Kerr
structure.


4.2. Kerr metric in higher dimensions and in string theory 


There also exist Schwarzschild and Kerr metrics in higher dimensional spacetimes. 
These investigations are mainly motivated by supergravity and string theory. 

Tangherlini'®* (1963) started to investigate higher-dimensional Schwarzschild 
solutions, with n — 1 spatial dimensions. He studied the (“planetary”) orbits in an 
n-dimensional Schwarzschild field (“Sun”) and found that only for n = 4 we have 
stable orbits, see also Ortín.'4© According to Tangherlini, this is then the only case
that is interesting for physics. Nowadays, however, many physicists hypothesize 
that higher dimensions do exist because string theory suggests it. 
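
The stability statement can be made plausible in the Newtonian limit. The following is a sketch under our own simplifying assumption, not Tangherlini's relativistic analysis: in $n$ spacetime dimensions the potential falls off as $1/r^{n-3}$, so the effective potential is $V_{\rm eff} = L^2/(2r^2) - \mu/r^{n-3}$. At a circular orbit one finds $V_{\rm eff}'' = k\mu(2-k)/r^{k+2}$ with $k = n - 3$, which is positive only for $n = 4$. All names below are illustrative.

```python
def d2_veff_at_circular_orbit(n, r0=1.0, mu=1.0, h=1e-4):
    """Second derivative of V_eff = L^2/(2 r^2) - mu/r^(n-3) at a circular orbit.

    Newtonian-limit sketch: L^2 is fixed by V_eff'(r0) = 0 and the second
    derivative is estimated by a central finite difference with step h.
    """
    k = n - 3
    L2 = k * mu * r0 ** (2 - k)  # from -L^2/r^3 + k mu / r^(k+1) = 0

    def veff(r):
        return L2 / (2.0 * r**2) - mu / r**k

    return (veff(r0 + h) - 2.0 * veff(r0) + veff(r0 - h)) / h**2


def circular_orbit_is_stable(n, tol=1e-6):
    """True if the circular orbit sits in a strict minimum of V_eff."""
    return d2_veff_at_circular_orbit(n) > tol
```

Scanning `n` from 4 to 8 with `circular_orbit_is_stable` singles out $n = 4$; the case $n = 5$ is marginal in this approximation, since $V_{\rm eff}$ is then flat along the family of circular orbits.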

Somewhat later, Myers and Perry!?® (1986) generalized these considerations
to higher-dimensional Kerr metrics. In the meantime a plethora of such
higher-dimensional objects have been found, see Allahverdizadeh et al.° and Frolov and
Zelnikov.® Recently Keeler et al.°” investigated, in the context of string theory, the
separability of Klein-Gordon or Dirac fields on higher-dimensional Kerr-type
solutions. Lately Brihaye et al.,2! for example, discussed the exact solution of
5D Myers-Perry black holes coupled to a massive scalar field. The physical
interpretation of these results remains to be seen.

Appendix
A.1. Exterior calculus and computer algebra


We want to use as input the Papapetrou metric (93). We take the equivalent
representation in the form of the orthonormal coframe of Eqs. (94)-(98). How such a
Reduce-Excalc program can be set up is demonstrated in Stauffer et al.!7* and in
Socorro et al.,'7° for the Einstein three-form, see Heinicke®?:


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Coframe of Andress-Lewis-Papapetrou-Buchdahl metric
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% file Buchdahl03.exi, 29 July 2014, fwh & chh

% in "Buchdahl03.exi";


load_package excalc;

off exp$

pform f=0, omega=0, gamma=0 $

fdomain f=f(rho,z), omega=omega(rho,z), gamma=gamma(rho,z) ;


coframe o(0) = sqrt(f) * (d t - omega * d phi),
        o(1) = sqrt(f)**(-1) * exp(gamma) * d rho,
        o(2) = sqrt(f)**(-1) * exp(gamma) * d z,
        o(3) = sqrt(f)**(-1) * rho * d phi
   with signature (1,-1,-1,-1);


displayframe;
frame e$


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Connection, curvature, and Einstein forms
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
pform conn1(a,b)=1, curv2(a,b)=2$

antisymmetric conn1, curv2$

factor o(0), o(1), o(2), o(3)$


conn1(-a,-b) := (1/2)*( e(-a)_|d o(-b) - e(-b)_|d o(-a)
                 - (e(-a)_| (e(-b)_|d o(-c))) * o(c))$


curv2(-a,b) := d conn1(-a,b) - conn1(-a,c) ^ conn1(-c,b)$


% Einstein tensor = Einstein 0-form
pform einstein3(a)=3, einstein0(a,b)=0$

symmetric einstein0$

einstein3(-a) := -(1/2) * curv2(b,-c) ^ # (o(-a) ^ o(-b) ^ o(c))$
einstein0(a,-b):= #( o(a) ^ einstein3(-b))$


on exp, gcd$
factor ^$
on nero;


einstein0(a,-b):= #( o(a) ^ einstein3(-b));


off nero;


% by inspection, we find
einstein0(1,-1) + einstein0(2,-2); % equals 0
einstein0(0,-0) - einstein0(3,-3); % eliminates gamma


out "Buchdahl03.exo";


load_package tri;

on tex;

on TeXBreak;

einstein0(a,-b) :=einstein0(a,-b);
off tex;

einstein0(a,-b) :=einstein0(a,-b);
omega:=0;

einstein0(a,-b) :=einstein0(a,-b);


shut "Buchdahl03.exo";
;end;

Our topic concerns a long-standing puzzle: The energy of gravitating systems. More
precisely we want to consider, for gravitating systems, how to best describe
energy-momentum and angular momentum/center-of-mass momentum (CoMM). It is known
that these quantities cannot be given by a local density. The modern understanding is 
that (i) they are quasi-local (associated with a closed 2-surface), (ii) they have no unique 
formula, (iii) they have no reference frame independent description. In the first part of 
this work, we review some early history, much of it not so well known, on the subject 
of gravitational energy in Einstein’s general relativity (GR), noting especially Noether’s 
contribution. In the second part, we review (including some new results) much of our 
covariant Hamiltonian formalism and apply it to Poincaré gauge theories of gravity 
(PG), with GR as a special case. The key point is that the Hamiltonian boundary term 
has two roles, it determines the quasi-local quantities, and furthermore, it determines 
the boundary conditions for the dynamical variables. Energy-momentum and angu- 
lar momentum/CoMM are associated with the geometric symmetries under Poincaré 
transformations. They are best described in a local Poincaré gauge theory. The type 
of spacetime that naturally has this symmetry is Riemann—Cartan spacetime, with a 
metric compatible connection having, in general, both curvature and torsion. Thus our 
expression for the energy-momentum of physical systems is obtained via our covariant 
Hamiltonian formulation applied to the PG. 


Keywords: Quasi-local energy; Hamiltonian boundary term. 


PACS Number(s): 04.20.Cv, 04.20.Fy 

1. Introduction


How to give a meaningful description of energy-momentum for gravitating systems 
(hence for all physical systems) has been an outstanding fundamental issue since 
Einstein began his search for his gravity theory, general relativity (GR). It is deeply 
connected to the essential nature of not only geometric gravity but of all the funda- 
mental interactions — their inherent gauge nature. Noether’s paper that includes 
her two famous theorems relating global symmetries to conserved quantities and 
local gauge symmetries to differential identities was originally motivated by this 
very issue. She showed that gravitational energy has no proper local description. 
So investigators only found various expressions which were inherently nontensorial 
(reference frame dependent pseudotensors). They have two inherent ambiguities: 
(i) There are many possible expressions, (ii) they are noncovariant — reference 
frame dependent. The modern view is that energy-momentum is quasi-local (asso- 
ciated with a closed 2-surface). Quasi-local proposals have analogous ambiguities. 
These ambiguities can be clarified by the Hamiltonian approach. From a first-order 
Lagrangian for quite general differential form fields, we have constructed a space- 
time covariant Hamiltonian formalism, which incorporates the Noether conserved 
currents and differential identities. The Hamiltonian that dynamically evolves a 
spatial region includes a boundary term. The explicit form of the boundary term 
depends on the boundary conditions and also on an appropriate reference choice. 
With a suitable vector field, it gives expressions for the quasi-local quantities
(energy-momentum, angular momentum/center-of-mass momentum, CoMM) and
also quasi-local energy flux. A geometric gauge theory perspective provides the
most appropriate dynamical variables. The geometry is Riemann-Cartan, with, in
general, both curvature and torsion. For the PG (GR is a special case) with general
source and gauge fields, we identified a preferred Hamiltonian boundary
expression along with a procedure for finding a “best matched” reference. With this
one can obtain values for the quasi-local energy-momentum and angular
momentum/CoMM.^a

Our topic here concerns the localization of energy-momentum. The main aim 
of our research program has actually been to better understand the Hamiltonian 
for dynamic spacetime geometry, especially the role of the Hamiltonian boundary 
term. It turns out that this sheds much light on the issue of the localization of 
energy.°® A number of different ideas will be fit together to give a good picture of 
this long standing puzzle. In addition to being mindful of Noether’s results, we will 
use a Hamiltonian approach combined with a local gauge theory view of dynamic 
spacetime geometry. 

This present work is largely just an application of Noether’s result. We will 
begin with some early history (much of it not so well known) regarding energy in 


^a For an alternative to our Hamiltonian approach to energy-momentum and angular momentum
for the PG see Ref. 66.
the context of GR, especially Noether’s contribution. Next, we will show how in
GR pseudotensors are connected with the Hamiltonian and introduce the
quasi-local idea. Following this, we make some brief remarks about gravity, geometry,
connection and gauge. We then introduce and give a short review of our main tool: 
differential forms. We develop in some detail variational principles with differential 
form fields. With this, we can give simple examples of applications of the two 
Noether theorems. We then introduce the first-order formalism followed by our 
Hamiltonian formulation and the 3+1 split. The Hamiltonian boundary term and its 
important roles are discussed next. Asymptotic fall offs for the fields are noted. We 
explain why Riemann—Cartan geometry is appropriate for our purposes. Variational 
principles for dynamic spacetime geometry with quite general sources are developed, 
including the Noether conserved currents and differential identities. We present a 
first-order and Hamiltonian formulation for the PG along with the Hamiltonian 
boundary term and identify our preferred Hamiltonian boundary term for these 
dynamic spacetime geometry theories. We also include a prescription for choosing 
the necessary reference values that are needed for the quasi-local energy-momentum 
and angular momentum expressions. 


2. Background 


As this present work approached its final form, we received some very good news:
All of the volumes of the Einstein papers published so far^b (both the originals
and the English translations®®) are now freely available online.?” An examination
of Einstein’s dozens of papers on gravity during the period 1913-1918, as well as
his extensive correspondence with his contemporaries on the topic of gravity, shows
that most of them include a significant consideration of the topic of gravitational
energy.


2.1. Some brief early history 


We have only begun to look into the historical development of the modern ideas 
regarding gravitational energy; the topic merits much deeper study. Here, we can 
only give a brief report, relying on the Einstein papers as well as some of the many 
good historical investigations available, in particular regarding energy-momentum 
conservation. 13:18:188 

We will rely on the Hamiltonian formalism applied to dynamical variables that 
are related to a local gauge theory of spacetime symmetry approach. (Earman®? 
has given a very interesting discussion on how the Hamiltonian approach connects 
with the gauge theory perspective.) 

It seems not so well known that gravitational energy, or more precisely the
proper description of the energy of gravitating systems (i.e. all real physical
systems), has played a large role in the development of 20th century physics.


^b At present up to 1923.

In the years 1912-1915, Einstein, when he was searching for satisfactory field
equations, used a form of the equations that explicitly included an
energy-momentum density for the gravitational field and was designed to satisfy the
principle of energy-momentum conservation.^c Thus, an expression for the
Einstein energy-momentum pseudotensor already existed even before he found the
correct field equations. It should be appreciated that general covariance brought
with it features that had never before been encountered in any theory. (Indeed
there is still controversy up to the present day.1®%+9°124) For a couple of years,
Einstein very much doubted that a generally covariant theory could be found^d; he
proposed that energy conservation would select the preferred physical coordinate
frame. Initially Hilbert followed Einstein in this belief (see the proofs of his first note
in Ref. 107).

Although Einstein began using variational principles in 1914,%° this was not his
path to the field equations. Hilbert was the first to identify a generally covariant
Lagrangian^e (proportional to the Riemannian scalar curvature). He also constructed
(in a complicated way that was not easy to understand) his “conserved energy 
vector,” a vector with vanishing divergence associated with the general coordinate 
invariance (i.e. diffeomorphism invariance) of his Lagrangian. 

Einstein’s energy-momentum pseudotensor was criticized*® for giving “unphysical”
values (Schrödinger'!® noted that one could choose the coordinates to give a
vanishing value outside a fluid sphere, and Bauer? noted that one could choose the 
coordinates to give a vanishing energy value for empty Minkowski space). 

Lorentz, Levi-Civita and Klein argued that the Einstein curvature tensor $G_{\mu\nu}$
was the only proper gravitational energy-momentum density; hence one should
regard the Einstein equation in the form

$$-\frac{1}{\kappa}\, G_{\mu\nu} + T_{\mu\nu} = 0, \tag{1}$$


as describing the vanishing sum of gravitational and material energy-momentum. 
(This idea has been advanced more recently by Cooperstock.?°) In our modern 
perspective for GR their idea is quite correct — but a density is not the whole 
story. There is more to energy-momentum than just a density. 


2.2. From Einstein’s correspondence 


Here are some excerpts from Einstein’s correspondence concerning the Einstein 
pseudotensor, Hilbert’s energy vector and Noether’s contribution.!3-'4:16,7491,110 
They reveal the difficulties and the extent of understanding these people had at 
that time. All these are quoted from the Einstein papers?”** Vol. 8. 


^c See Refs. 60-62, 93, 97 and 125 for discussions of how Einstein found his field equations.?4

^d One reason was his famous “hole” argument.6%97

^e Einstein and Hilbert had quite different agendas.!!1:114:136.141 Hilbert in his Foundation of
Physics papers, based on the work of Einstein and Mie, was using his axiomatic method with
the objective of finding a unified field theory of gravity and electromagnetism. 15:27:57 107,108,113

“Highly esteemed Colleague, ... I am sitting over your relativity paper, ...,
and am honestly toiling over it. I do admire your method, as far as I have 
understood it. But at certain points I cannot progress and therefore ask 
that you assist me with brief instructions. ... I still do not grasp the energy 
principle at all, not even as a statement” (Doc. 221 to Hilbert 25 May 
1916). 


“Your explanation of Eq. (6) of your paper was charming. Why do you 
make it so hard for poor mortals by withholding the technique behind your 
ideas? ... In your paper everything is understandable to me now except 
for the energy theorem. Please do not be angry with me that I ask you 
about this again. ... How is this cleared up? It would suffice, of course, if 
you would charge Miss Noether with explaining this to me” (Doc. 223 to 
Hilbert 30 May 1916). 


“My $t_\sigma^\nu$’s are being rejected by everyone as unkosher” (Doc. 503 to Hilbert
12 April 1918).


“... Only (24) is an identity ... The relations here are exactly analogous
to those for nonrelativistic theories” (Doc. 480 to Klein 13 March 1918). 


“I have succeeded in discovering the organic formation law for Hilbert’s
energy vector” (Doc. 588 from Klein 15 July 1918). 


“The only thing I was unable to grasp in your paper is the conclusion at
the top of page 8 that $\varepsilon^\sigma$ was a vector” (Doc. 638 to Klein 22 Oct 1918).


“Thank you very much for the transparent proof, which I understood com- 
pletely” (Doc. 646 to Klein 8 Nov 1918). 


“... Meanwhile, with Miss Noether’s help, I understand that the proof for
the vector character of $\varepsilon^\sigma$ from “higher principles” as I had sought was
already given by Hilbert on pp. 6, 7 of his first note, although in a version 
that does not draw attention to the essential point” (Doc. 650 from Klein 
10 Nov 1918). 


Briefly, after a couple of years Klein clarified Hilbert’s energy-momentum “vec- 
tor”; he related it to Einstein’s pseudotensor, but (as we will discuss in more detail 
shortly) disagreed with Einstein’s physical interpretation of divergenceless expres- 
sions. Enlisted by Hilbert and Klein, it was Emmy Noether who solved the primary 
puzzle regarding gravitational energy. 


^f For these investigators, these things were not as easy as they are for us today; in particular the
Bianchi identity and its contracted version were not known to these people,?”!!? so they had, in
effect, to rediscover that identity from effectively the diffeomorphism invariance of the Lagrangian.

2.3. Noether’s contribution


If one had to describe 20th century physics in one word, a good choice would be 
symmetry. Most of the new theoretical physics ideas involved symmetry. Essentially 
they can be seen as applications of Noether’s theorems. Briefly, Noether’s first the- 
orem associates conserved quantities with global symmetries, and Noether’s second 
theorem concerns local symmetries: It is the mathematical foundation of the
modern gauge theories. Unfortunately her work^g was largely overlooked for about 50
years.’4 Why did Noether make her investigation? To clarify the issue of
gravitational energy.
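
As a toy illustration of the first theorem (our own example, not taken from Noether's paper), consider a unit-mass particle in the central potential $V = -1/r$. The Lagrangian is invariant under global rotations, and the associated conserved quantity is the angular momentum $L = x v_y - y v_x$. The sketch below integrates the orbit with a kick-drift-kick leapfrog scheme, which preserves $L$ up to round-off because each kick is parallel to the position and each drift is parallel to the velocity; the initial data and all names are arbitrary choices.

```python
def angular_momentum_drift(steps=1000, dt=0.01):
    """Integrate a bound Kepler orbit and return |L_final - L_initial|.

    Unit mass, potential V = -1/r, kick-drift-kick leapfrog integrator.
    """
    x, y, vx, vy = 1.0, 0.0, 0.0, 0.8  # arbitrary bound initial data

    def acceleration(px, py):
        r3 = (px * px + py * py) ** 1.5
        return -px / r3, -py / r3

    l_initial = x * vy - y * vx
    ax, ay = acceleration(x, y)
    for _ in range(steps):
        vx += 0.5 * dt * ax  # half kick
        vy += 0.5 * dt * ay
        x += dt * vx  # full drift
        y += dt * vy
        ax, ay = acceleration(x, y)
        vx += 0.5 * dt * ax  # half kick
        vy += 0.5 * dt * ay
    return abs((x * vy - y * vx) - l_initial)
```

The returned drift stays at the level of floating-point round-off, whereas the energy (tied to time-translation symmetry, which the discrete scheme breaks) is only approximately conserved.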

Klein was looking into Einstein’s theory and the relationship between Einstein’s 
pseudotensor and Hilbert’s energy vector. Some of the correspondence between 
Hilbert and Klein was published in a paper.”! We quote some excerpts^h:

Klein wrote 


You know that Miss Noether advises me continually regarding my work, 
and that in fact it is only thanks to her that I have understood these 
questions. When I was speaking recently to Miss Noether about my 
result concerning your energy vector, she was able to inform me that she 
had derived the same result on the basis of developments of your note 
(and thus not from the simplified calculations of my section 4) more 
than a year ago, and that she had then put all of that in a manuscript 
(which I was subsequently able to read). She simply did not set it out 
as forcefully as I recently did at the Mathematical Society (22 January 
[1918]). 


Hilbert responded 


I fully agree in fact with your statements on the energy theorems: Emmy 
Noether, on whom I have called for assistance more than a year ago to 
clarify this type of analytical questions concerning my energy theorem, 
found at that time that the energy components that I had proposed — 
as well as those of Einstein — could be formally transformed, using 
the Lagrange differential equations (4) and (5) of my first note, into 
expressions whose divergence vanishes identically, that is to say, without 
using the Lagrange equations (4) and (5). 


Also 


Indeed I believe that in the case of general relativity, i.e. in the case of 
the general invariance of the Hamiltonian function, the energy equations 
which in your opinion correspond to the energy equations of the theory 


g For discussions see Refs. 13, 14, 16, 74, 91 and 110. 
h We do not know of any English translation of Klein’s papers; the translations of the following 
excerpts are quoted from Ref. 74, p. 66. 
of orthogonal invariance do not exist at all; I can even call this fact a 
characteristic of the general theory of relativity. 


What Hilbert here calls the Hamiltonian function we now refer to as the 
Lagrangian. Noether wrote her 1918 paper to clarify the situation. 


2.4. Noether’s result 


Many have heard of Noether’s theorems, but the full scope of what she actually 
did is not so widely known. So, we take this opportunity to quote her key results in 
full (from the highly recommended book of Kosmann-Schwarzbach, Ref. 74). (It 
should be mentioned that her Lagrangians were quite general: they could depend 
on any finite number of derivatives.) 


Theorem I. If the integral I is invariant under a (finite continuous group with 
$\rho$ parameters) $G_\rho$, then there are $\rho$ linearly independent combinations among the 
Lagrangian expressions which become divergences, and conversely, that implies 
the invariance of I under a (group) $G_\rho$. The theorem remains valid in the limiting 
case of an infinite number of parameters. 


Theorem II. If the integral I is invariant under a (an infinite continuous group) 
$G_{\infty\rho}$ depending on arbitrary functions and their derivatives up to order $\sigma$, then 
there are $\rho$ identities among the Lagrangian expressions and their derivatives up to 
order $\sigma$. Here as well the converse is valid. 


Furthermore, she has another important result. Although it follows easily from 
Theorem II, in our opinion it could have been set off and called Theorem III, both 
because of its importance and because it was the key issue motivating her 
investigation: 


Given I invariant under the group of translations, then the energy relations are 
improper if and only if I is invariant under an infinite group which contains the 
group of translations as a subgroup. 


Regarding this latter result, she ends her paper with the remarks 


As Hilbert expresses his assertion, the lack of a proper law of energy 
constitutes a characteristic of the “general theory of relativity.” For that 
assertion to be literally valid, it is necessary to understand the term 
“general relativity” in a wider sense than is usual, and to extend it to 
the aforementioned groups that depend on n arbitrary functions.²⁷ 


The footnote that ends her paper is also of interest: 


27 This confirms once more the accuracy of Klein’s remark that the term 
“relativity” as it is used in physics should be replaced by “invariance with 
respect to a group.” 
Her result regarding the lack of a proper law of energy applies not just to 
Einstein’s GR theory, but in fact to all geometric theories of gravity: for all such 
theories, there is no proper conserved energy-momentum density. 

As a well-known textbook expresses it: 


Anyone who looks for a magic formula for “local gravitational 
energy-momentum” is looking for the right answer to the wrong question. 
Unhappily, enormous time and effort were devoted in the past to trying 
to “answer this question” before investigators realized the futility of the 
enterprise (Misner, Thorne and Wheeler, Gravitation, Ref. 82, p. 467). 


3. The Noether Energy-Momentum Current Ambiguity 


Let us begin our technical discussion by first reviewing some background. 

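As we will soon show, a well-known result is that from a classical field Lagrangian 
density, $L(\varphi^A, \partial_\mu \varphi^A)$, via Noether’s first theorem, the translational symmetry of 
Minkowski spacetime leads to a simple formula for the “conserved” canonical 
energy-momentum density 

$$T^\mu{}_\nu := \delta^\mu_\nu L - \frac{\partial L}{\partial(\partial_\mu \varphi^A)}\,\partial_\nu \varphi^A. \tag{2}$$

The divergence of this expression satisfies the identity^i 

$$\partial_\mu T^\mu{}_\nu \equiv \frac{\delta L}{\delta \varphi^A}\,\partial_\nu \varphi^A, \tag{3}$$

which can easily be directly verified using the definition of the Euler-Lagrange 
variational derivative 

$$\frac{\delta L}{\delta \varphi^A} := \frac{\partial L}{\partial \varphi^A} - \partial_\mu \frac{\partial L}{\partial(\partial_\mu \varphi^A)}. \tag{4}$$

The canonical energy-momentum density is a conserved current in the sense that 
“on shell” (i.e. when the Euler-Lagrange field equations are satisfied: $\delta L/\delta \varphi^A = 0$) 
its divergence vanishes. 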
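
As a quick check of the divergence identity (3), the following sketch spells out the chain-rule steps. It takes $T^\mu{}_\nu := \delta^\mu_\nu L - \pi^\mu{}_A\,\partial_\nu\varphi^A$, where the shorthand $\pi^\mu{}_A$ is our own label (it is not used in the text), and it assumes that $L$ depends on position only through the fields. The `align*` environment assumes the amsmath package.

```latex
\begin{align*}
\pi^\mu{}_A &:= \frac{\partial L}{\partial(\partial_\mu\varphi^A)}
  && \text{(shorthand introduced for this sketch)}\\
\partial_\mu T^\mu{}_\nu
  &= \partial_\nu L - (\partial_\mu\pi^\mu{}_A)\,\partial_\nu\varphi^A
     - \pi^\mu{}_A\,\partial_\mu\partial_\nu\varphi^A\\
\partial_\nu L
  &= \frac{\partial L}{\partial\varphi^A}\,\partial_\nu\varphi^A
     + \pi^\mu{}_A\,\partial_\nu\partial_\mu\varphi^A
  && \text{(no explicit position dependence)}\\
\Rightarrow\quad \partial_\mu T^\mu{}_\nu
  &= \Big(\frac{\partial L}{\partial\varphi^A} - \partial_\mu\pi^\mu{}_A\Big)\partial_\nu\varphi^A
   = \frac{\delta L}{\delta\varphi^A}\,\partial_\nu\varphi^A
\end{align*}
```

The second-derivative terms cancel because partial derivatives commute, which is why the result holds "off shell" as an identity.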

The above energy-momentum density has the usual conserved current 
ambiguity: 

$$T'^\mu{}_\nu := T^\mu{}_\nu + \partial_\lambda U^{[\mu\lambda]}{}_\nu \tag{5}$$

is likewise conserved but defines different energy-momentum values. Essentially, 
one can always adjust a divergence-free current by a “curl.” 
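
To see why such a "curl" term is harmless for conservation yet changes the values obtained, here is a short sketch. It assumes only that the added object $U^{[\mu\lambda]}{}_\nu$ is antisymmetric in its upper indices and that partial derivatives commute; the spatial region $V$ with boundary $\partial V$ is introduced here purely for illustration.

```latex
\begin{align*}
\partial_\mu\partial_\lambda U^{[\mu\lambda]}{}_\nu &= 0
  && \text{(symmetric derivatives contracted with antisymmetric indices)}\\
\Rightarrow\quad
\partial_\mu\big(T^\mu{}_\nu + \partial_\lambda U^{[\mu\lambda]}{}_\nu\big)
  &= \partial_\mu T^\mu{}_\nu\\
\int_V \partial_\lambda U^{[0\lambda]}{}_\nu\, d^3x
  &= \int_V \partial_k U^{[0k]}{}_\nu\, d^3x
   = \oint_{\partial V} U^{[0k]}{}_\nu\, dS_k
  && (k = 1,2,3;\ U^{[00]}{}_\nu = 0)
\end{align*}
```

So the two currents have the same divergence, while their integrated values over $V$ differ by a boundary term.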


i A consequence of assuming that the Lagrangian depends on position only through the field $\varphi^A$. 
At first thought, one might be inclined to follow the rule of sticking with the 
results obtained directly from the Lagrangian and the above formula. But sometimes 
the results so obtained are not so suitable physically. 

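A simple example in Minkowski space is the Lagrangian density 

$$L = -\frac{1}{4} F^{\mu\nu} F_{\mu\nu}, \qquad F_{\mu\nu} := \partial_\mu A_\nu - \partial_\nu A_\mu. \tag{6}$$

If one regards this Lagrangian density according to the above paradigm as a function 
of $A_\mu$ and $\partial_\mu A_\nu$, then the above formula leads directly to the conserved expression 

$$T^\mu{}_\nu = F^{\mu\alpha}\,\partial_\nu A_\alpha - \frac{1}{4}\delta^\mu_\nu F^{\alpha\beta} F_{\alpha\beta}. \tag{7}$$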


Now the above Lagrangian density (up to a suitable overall scaling coefficient that 
can be set aside for the purposes of this section) can be used to describe Maxwell 
electrodynamics. As is physically appropriate, this Lagrangian density and the field 
equations obtained from it are gauge invariant under the local “gauge” transformation 
$A_\mu \to A_\mu + \partial_\mu \chi$. However, the above canonical energy-momentum density 
is not gauge invariant (nevertheless, as we will see later, it can still be useful 
physically). Naturally, one would generally prefer to have a gauge invariant 
energy-momentum density for electrodynamics. In this particular case, there are several 
ways that one can find an alternative to (7): (i) One can exploit the above-mentioned 
freedom (5) and thereby find “by hand” an “improved” gauge invariant expression, 
(ii) one could regard the Lagrangian as being a function of a one-form A and its 
differential (this is really the proper way to treat electromagnetism — but then one 
needs an extension of the above classical field theory formalism that can accommo- 
date form fields; we will discuss such a formalism below) or (iii) one can consider 
that physically any time one has material energy-momentum one must also have 
gravity: The gravitational equations will include an unambiguous gauge invariant 
energy-momentum density. From a specific gravity theory, one gets a specific for- 
mula for the energy-momentum density. In this way, the ambiguity in the canonical 
energy-momentum density for any classical field can be entirely removed when we 
consider their gravitational effects. Specifically, in the case of GR, knowing the 
curvature gives, via the Einstein tensor, the symmetric Hilbert energy-momentum 
density. In particular for the electromagnetism example this is 


$$T^\mu{}_\nu = F^{\mu\gamma} F_{\nu\gamma} - \frac{1}{4}\delta^\mu_\nu F^{\alpha\beta} F_{\alpha\beta}, \tag{8}$$


which is no doubt a good choice for the energy-momentum density for Maxwell 
electrodynamics; in fact it is, as we shall see, the same as the energy-momentum 
density that one obtains by regarding the vector potential as a one-form. 
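
The relation between the canonical and the gauge invariant expressions can be made explicit. The sketch below assumes source-free Maxwell theory with $L = -\tfrac14 F^{\mu\nu}F_{\mu\nu}$ and $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$; the labels "can" and "sym" and the particular superpotential $U^{[\mu\alpha]}{}_\nu$ are introduced here for illustration.

```latex
\begin{align*}
T_{\rm can}{}^\mu{}_\nu &= F^{\mu\alpha}\,\partial_\nu A_\alpha
   - \tfrac14\,\delta^\mu_\nu\,F^{\alpha\beta}F_{\alpha\beta},\\
T_{\rm sym}{}^\mu{}_\nu &= F^{\mu\alpha}F_{\nu\alpha}
   - \tfrac14\,\delta^\mu_\nu\,F^{\alpha\beta}F_{\alpha\beta}
   = T_{\rm can}{}^\mu{}_\nu - F^{\mu\alpha}\,\partial_\alpha A_\nu,\\
-F^{\mu\alpha}\,\partial_\alpha A_\nu
  &= \partial_\alpha\big(-F^{\mu\alpha}A_\nu\big) + (\partial_\alpha F^{\mu\alpha})\,A_\nu,\\
\Rightarrow\quad T_{\rm sym}{}^\mu{}_\nu
  &= T_{\rm can}{}^\mu{}_\nu + \partial_\alpha U^{[\mu\alpha]}{}_\nu
  \quad\text{on shell},\qquad U^{[\mu\alpha]}{}_\nu := -F^{\mu\alpha}A_\nu .
\end{align*}
```

The last step uses the vacuum field equation $\partial_\alpha F^{\mu\alpha} = 0$, so the gauge invariant density differs from the canonical one by exactly the kind of "curl" term that the current ambiguity allows.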


j The existence of a gravitational field will reveal the location of a source with energy-momentum 
even if it has not otherwise been detected. An important example of this, assuming gravity 
is well described by GR, is that from astronomical observations it seems that there is a large amount 
of “dark matter” in the universe. Clearly, the issue of the proper description of energy for gravitating 
systems can have major consequences for our conception of the physical world. 
While for electromagnetism one has another criterion (gauge invariance) that can 
be used to arrive at a physically suitable energy-momentum density, for most other 
sources one can only turn to gravity to identify a unique energy-momentum density. 
Hence, gravity uniquely detects the energy-momentum density of its sources. It 
may thus seem somewhat ironic that for gravity itself there is no proper 
energy-momentum density. 


4. Pseudotensors 


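The Einstein Lagrangian for GR differs from the Hilbert scalar curvature 
Lagrangian by a certain total divergence which removes the second derivatives of the 
metric: 

$$\begin{aligned}
2\kappa L_E(g_{\alpha\beta}, \partial_\mu g_{\alpha\beta}) &:= -\sqrt{-g}\, g^{\beta\sigma}\, \Gamma^\alpha{}_{\gamma\mu} \Gamma^\gamma{}_{\beta\nu}\, \delta^{\mu\nu}_{\alpha\sigma} \\
&\equiv \sqrt{-g}\, R - \partial_\mu\big(\sqrt{-g}\, g^{\beta\sigma}\, \Gamma^\alpha{}_{\beta\nu}\, \delta^{\mu\nu}_{\alpha\sigma}\big) \\
&=: 2\kappa L_H(g_{\alpha\beta}, \partial_\mu g_{\alpha\beta}, \partial_{\mu\nu} g_{\alpha\beta}) - \partial_\mu K^\mu.
\end{aligned} \tag{9}$$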


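They give the same field equations. The Einstein pseudotensor can be obtained 
from $L_E$ using the aforementioned formula for the canonical energy-momentum 
tensor (2): 

$$\mathfrak{t}^\mu{}_\nu := \delta^\mu_\nu L_E - \frac{\partial L_E}{\partial(\partial_\mu g_{\alpha\beta})}\,\partial_\nu g_{\alpha\beta}. \tag{10}$$

(Here, following tradition, gothic letters indicate densities.) We have from (3) using 
the Einstein equation 

$$\partial_\mu \mathfrak{t}^\mu{}_\nu \equiv \frac{\delta L_E}{\delta g_{\alpha\beta}}\,\partial_\nu g_{\alpha\beta} = -(2\kappa)^{-1}\sqrt{-g}\, G^{\alpha\beta}\,\partial_\nu g_{\alpha\beta} = -\frac{1}{2}\mathfrak{T}^{\alpha\beta}\,\partial_\nu g_{\alpha\beta}. \tag{11}$$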

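Hence, using the vanishing covariant divergence of the material energy-momentum, 

$$0 = \nabla_\mu \mathfrak{T}^\mu{}_\nu \equiv \partial_\mu \mathfrak{T}^\mu{}_\nu - \Gamma^\beta{}_{\nu\mu}\mathfrak{T}^\mu{}_\beta \equiv \partial_\mu \mathfrak{T}^\mu{}_\nu - \frac{1}{2}\mathfrak{T}^{\alpha\beta}\,\partial_\nu g_{\alpha\beta}, \tag{12}$$

we obtain 

$$\partial_\mu(\mathfrak{T}^\mu{}_\nu + \mathfrak{t}^\mu{}_\nu) = 0, \tag{13}$$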


a vanishing ordinary divergence, i.e. a conserved total energy-momentum “current.” 
Here, we assumed the vanishing of the covariant divergence of the material 
energy-momentum tensor and used Einstein’s equations to obtain an ordinary divergence 
conserved current. But one can argue the other way around, as Einstein did in 1916. 


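k Here, $\Gamma^\alpha{}_{\beta\gamma} := \frac{1}{2} g^{\alpha\delta}(\partial_\beta g_{\delta\gamma} + \partial_\gamma g_{\delta\beta} - \partial_\delta g_{\beta\gamma})$ is the well-known Christoffel/Levi-Civita connection, 
$\delta^{\mu\nu}_{\alpha\sigma} := 2\delta^{[\mu}_\alpha \delta^{\nu]}_\sigma$ and $\kappa := 8\pi G/c^4$. 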
4.1. Einstein, Klein and superpotentials 


Einstein®® obtained results of the form 


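$$\partial_\mu(\mathfrak{T}^\mu{}_\nu + \mathfrak{t}^\mu{}_\nu) = 0, \tag{14}$$
$$\mathfrak{T}^\mu{}_\nu + \mathfrak{t}^\mu{}_\nu = \partial_\lambda \mathfrak{s}^{\mu\lambda}{}_\nu, \tag{15}$$
$$\partial_{\mu\lambda}\, \mathfrak{s}^{\mu\lambda}{}_\nu \equiv 0, \tag{16}$$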

OL 
BA = ght. 17 
ae Da,g’ (17) 


Klein regarded the first three relations as mathematical identities and argued that 
energy-momentum conservation in GR was fundamentally different from that in 
classical mechanics.!*;"! Einstein did not agree with either of these statements; 
he regarded only (16) as an identity, which he obtained using a general coordi- 
nate invariance argument. Now taking the divergence of (15) using (16) gives (14), 
which — reversing the computation in the previous subsection — leads to (12). 
In this way Einstein showed that the local coordinate invariance identity plus his 
field equations (which are equivalent to (15)) give the conservation of material 
energy-momentum, without ever having to use any matter field equations. This 
type of argument is referred to as automatic conservation of the source (see MTW, Ref. 82, 
Sec. 17.1); effectively it uses a Noether second theorem type of argument to obtain 
current conservation. Weyl used the same type of argument for the conservation 
of the electromagnetic current in his seminal gauge theory papers (Refs. 147 and 148), whereas 
modern field theory books generally use Noether’s first theorem in connection with 
current conservation.!? 

The identity (16) is equivalent to the contracted Bianchi identity, $\nabla_\mu G^\mu{}_\nu = 0$, 
where $G^\mu{}_\nu$ is the Einstein curvature tensor. In those days the Bianchi identity, 
$\nabla_{[\lambda} R^{\alpha\beta}{}_{\mu\nu]} = 0$, was not generally well known (it was first used in GR by Levi-Civita 
in 19171!*). For any Lagrangian constructed out of the metric and its derivatives, 
it is now well known that local diffeomorphism invariance (with $\delta g_{\mu\nu} = \pounds_\xi g_{\mu\nu} = 
\nabla_\mu \xi_\nu + \nabla_\nu \xi_\mu$) of the associated action leads to a divergence identity: 


OF gy, clits =0=> vo = 0. (18) 

Our Our 
In general such identities can involve higher derivatives of the curvature, however 
for the Hilbert scalar curvature Lagrangian of GR this Noether second theorem type 
argument yields a divergence which coincides with the contraction of the Bianchi 
identity. 
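
The standard form of this Noether second theorem type of argument can be sketched as follows. This is our own outline under stated assumptions, not a transcription of the displayed identity: $S = \int L\, d^4x$ is built from the metric alone, $\xi_\nu$ has compact support so that boundary terms drop, and the symbol $E^{\mu\nu}$ is a label introduced here for the variational derivative (a symmetric tensor density). The `align*` environment assumes the amsmath package.

```latex
\begin{align*}
0 = \delta_\xi S
  &= \int \frac{\delta L}{\delta g_{\mu\nu}}\,\pounds_\xi g_{\mu\nu}\,d^4x
   = 2\int E^{\mu\nu}\,\nabla_\mu\xi_\nu\,d^4x,
  \qquad E^{\mu\nu} := \frac{\delta L}{\delta g_{\mu\nu}} = E^{(\mu\nu)},\\
  &= 2\int \partial_\mu\big(E^{\mu\nu}\xi_\nu\big)\,d^4x
   - 2\int \big(\nabla_\mu E^{\mu\nu}\big)\,\xi_\nu\,d^4x,\\
\Rightarrow\quad \nabla_\mu E^{\mu\nu} &\equiv 0
  \qquad\text{(since $\xi_\nu$ is arbitrary)}.
\end{align*}
```

The second line uses the fact that $E^{\mu\nu}\xi_\nu$ is a vector density, so its covariant and ordinary divergences coincide. For the Hilbert Lagrangian, $E^{\mu\nu} = -(2\kappa)^{-1}\sqrt{-g}\,G^{\mu\nu}$, and the identity reduces to the contracted Bianchi identity $\nabla_\mu G^{\mu\nu} = 0$.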

By the way, Einstein had been using essentially the set of energy-momentum 
conservation relations (14)—(17) for some years in connection with the (noncovari- 
ant) “Entwurf” equations that he worked out with Marcel Grossmann.®? However, 
his Lagrangian for that scheme (see CPAE,?”° Vol. 6, Doc. 2) was not — up to 
an exact differential — diffeomorphically invariant, so that (16) was in that case a 
relation that selected a preferred set of coordinates. 
Since for the Einstein Lagrangian (16) is an identity, one might wonder why 
Einstein (and others) favored an invariance argument rather than just directly 
calculating it. It turns out that the invariance argument is considerably easier. Let 
us look into this a little further. The detailed form of (17) can be written out with 
the help of some formulas given by Tolman.'3” From his Eqs. (87.5) and (89.9), we 
have 


il 
g,, = —g"4*W va + 509 Wop, (19) 
OLE 7 
Wg = ae Prag + OF yo: (20) 


To show directly that this expression satisfies (16) is not short or simple. The only 
published calculation that we know of is in Møller,®? some 40 years later. 

If $\mathfrak{s}^{\mu\lambda}{}_\nu$ were antisymmetric in its upper indices the identity would be trivially 
satisfied. Stated another way, if (14) is to be satisfied then there should exist a 
superpotential $\mathfrak{U}^{\mu\lambda}{}_\nu \equiv \mathfrak{U}^{[\mu\lambda]}{}_\nu$ such that 


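$$2\kappa(\mathfrak{T}^\mu{}_\nu + \mathfrak{t}^\mu{}_\nu) = \partial_\lambda \mathfrak{U}^{\mu\lambda}{}_\nu. \tag{21}$$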


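Einstein’s $\mathfrak{s}^{\mu\lambda}{}_\nu$ is not antisymmetric; it is not the right kind of potential. A suitable 
superpotential for the Einstein pseudotensor was found over 20 years later by 
Freud*?: 

$$\mathfrak{U}_F{}^{\mu\lambda}{}_\nu := -\sqrt{-g}\, g^{\beta\sigma}\, \Gamma^\alpha{}_{\beta\gamma}\, \delta^{\mu\lambda\gamma}_{\alpha\sigma\nu}. \tag{22}$$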

To obtain this, he did not follow Einstein’s path. He started with the basics, the 
Einstein equations, and rearranged them using some formulas from Weyl’s book!*® 
and some complicated identities he found in Pauli’s 1921 encyclopedia article.%? 
Later, we will give a simple derivation of Freud’s superpotential using a better 
technique. 


4.2. Other GR pseudotensors 


The presence of a nonvanishing energy-momentum density necessarily produces 
gravity (i.e. the curvature of spacetime). In curved spacetime, the total source 
energy-momentum tensor satisfies (12). Without the second term we would have an 
expression suitable for integrating to obtain a conservation law. The second term 
represents a local interaction exchanging energy-momentum between the source 
and the gravitational field. To have a good conservation law, we would like to 
rewrite (12) in the form of (14) for some suitable gravitational energy-momentum 
density $\mathfrak{t}^\mu{}_\nu$. In fact this can be done in an infinite number of ways, and, in all cases, 
the quantity $\mathfrak{t}^\mu{}_\nu$ is not a tensor. (For some good overviews of such pseudotensors 
and their properties see Refs. 44, 64, 130 and 138.) 

Here is a construction. Select an object (referred to as a superpotential) with 
suitable symmetries: $\mathfrak{U}^{\mu\lambda}{}_\nu \equiv \mathfrak{U}^{[\mu\lambda]}{}_\nu$. Now use it to split the Einstein tensor, defining 
a gravitational energy-momentum pseudotensor according to 

$$2\kappa\, \mathfrak{t}^\mu{}_\nu := -2\sqrt{-g}\, G^\mu{}_\nu + \partial_\lambda \mathfrak{U}^{\mu\lambda}{}_\nu. \tag{23}$$


Then Einstein’s equation, $G^\mu{}_\nu = \kappa T^\mu{}_\nu$, takes a form (analogous to Maxwell’s 
equation) with a total effective energy-momentum pseudotensor as its source: 

$$\partial_\lambda \mathfrak{U}^{\mu\lambda}{}_\nu = 2\kappa(\mathfrak{t}^\mu{}_\nu + \mathfrak{T}^\mu{}_\nu). \tag{24}$$

The essential feature of such source expressions is that they are equated to a derivative 
of a superpotential in such a way that their divergence automatically vanishes. 
Thus, as a consequence of the antisymmetry of $\mathfrak{U}^{[\mu\lambda]}{}_\nu$, we have (similar to Maxwell’s 
theory) “automatic conservation of the source”: $\partial_\mu(\mathfrak{T}^\mu{}_\nu + \mathfrak{t}^\mu{}_\nu) = 0$. This expression 
can be integrated to define the total conserved energy-momentum within any 
volume V: 


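$$P_\nu(V) := \int_V (\mathfrak{T}^\mu{}_\nu + \mathfrak{t}^\mu{}_\nu)\, d^3\Sigma_\mu = \int_V \frac{1}{2\kappa}\,\partial_\lambda \mathfrak{U}^{\mu\lambda}{}_\nu\, d^3\Sigma_\mu \equiv \oint_{S=\partial V} \frac{1}{2\kappa}\,\mathfrak{U}^{\mu\lambda}{}_\nu\, \frac{1}{2}\, d^2 S_{\mu\lambda}. \tag{25}$$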


From the volume integral on the left, one would expect that the results would 
be highly ambiguous — depending on the choice of reference frame throughout 
the volume of interest. However, from the last surface form, one can see that the 
situation is not nearly so bad. The result does not depend on the choice of reference 
frame within the volume; it is quasi-local, i.e. it depends on the fields and choice 
of reference frame only on the boundary. It should be noted that (for any given 
reference frame on the boundary) the value of $P_\nu(V)$ is well defined by the above 
integral. Its value, however, comes from a mixture of physics and a quasi-local 
reference frame; still it can be useful if one is mindful of its nature. 

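There are some variations on the above formulation. The classical pseudotensorial 
total energy-momentum density complexes, $\mathcal{T}^\mu{}_\nu := \mathfrak{T}^\mu{}_\nu + \mathfrak{t}^\mu{}_\nu$, all follow from 
suitable superpotentials according to one of the patterns 

$$2\kappa \mathcal{T}^\mu{}_\nu = \partial_\lambda \mathfrak{U}^{[\mu\lambda]}{}_\nu, \qquad 2\kappa \mathcal{T}^{\mu\nu} = \partial_\lambda \mathfrak{U}^{[\mu\lambda]\nu}, \qquad 2\kappa \mathcal{T}^{\mu\nu} = \partial_{\alpha\beta} H^{\mu\alpha\nu\beta}, \tag{26}$$

where the superpotentials have certain symmetries which automatically guarantee 
conservation: specifically $\mathfrak{U}^{\mu\lambda}{}_\nu \equiv \mathfrak{U}^{[\mu\lambda]}{}_\nu$ and $\mathfrak{U}^{\mu\lambda\nu} \equiv \mathfrak{U}^{[\mu\lambda]\nu}$, while $H^{\mu\alpha\nu\beta}$ has 
the algebraic symmetries of the Riemann tensor (this latter form yields a symmetric 
pseudotensor and, hence, a simpler conservation of angular momentum 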
description, see MTW Ref. 82, §20.3). We have already considered the Einstein 
total energy-momentum density which follows from the Freud superpotential (22). 
For completeness, we list the other well-known ones. The Bergmann—Thompson,® 
Landau-Lifshitz,”’ Papapetrou,?® Weinberg!*° (also used in MTW®?) and Moller®? 


total energy-momentum complex expressions can be obtained respectively from 


ry gO UH (27) 
ute” = |g|2UBA”, equivalently HoH” := |g|dH%g%°g™, (28) 
HoHOY .— Guo GYBGab(|g|zg™™), (29) 
a Vv a4 Vv a, ) =mcr—n =mn =C 
SGehY = Sle 5rP |g|2 9 °(- greg + 5 g _) Gedy (30) 
aN ona _ 
UNA, = —|g|2.9°7P? ade = |g]? g°"G** (Oagsy — Osgav). (31) 


Here, 9%” is the Minkowski metric, all indices in these expressions refer to spacetime 


and range from 0 to 3, otherwise our conventions follow MTW.®*? 

People have often looked askance at such pseudotensors, e.g. the above quote 
from MTW, and Schrödinger refers to them as “sham.”!*" As we noted, there are no 
doubt two unsatisfactory aspects: (i) Which of the many possible expressions should 
one use? (ii) Which quasi-local (in view of (25)) reference frame should be used? 
On the other hand, one should also be mindful that (a) they do provide a description 
of energy-momentum conservation, (b) they (like connection coefficients) really are 
geometric objects, with well defined values in each reference frame (this issue has 
been rigorously addressed using fiber bundle formulations??:104:19°). 

All of these pseudotensors (except for Møller’s) give the expected total energy-momentum 
values at spatial infinity. On the other hand, none of them give 
the desired positivity of energy for small vacuum regions to lowest nonvanishing 
order,'!*! however a set of new pseudotensors depending on several parameters with 
this desirable property has been constructed.!2° How can one understand the phys- 
ical significance of these various pseudotensors? We have found a way using the 
Hamiltonian approach. 


4.3. Pseudotensors and the Hamiltonian 


To see how one can be led to the Hamiltonian, one merely needs to redo the cal- 
culation of (25) as an identity (“off shell”). For some fixed reference frame, with a 
(constant in the present reference frame) vector field $Z^\mu$ inserted we find^l 


—Z"P(V) := -{ ZL” /—gd Dy 
Vv 


1 1 
=| z"v=a Gaz 7) 5 (AA) | a ee 
V 


= | ZHHER + f BAZ = BV. (32) 
Vv S=0V 


Here, $\mathcal{H}^{GR}_\mu$ can be recognized as the covariant expression which, when expressed in 
terms of the appropriate canonical variables, is just the ADM Hamiltonian density 


l The sign in this expression is dictated by the condition for positive energy determined by the 
Hamiltonian using our local Minkowski signature convention: $P_\mu = (-E/c, \mathbf{p})$. 
(i.e. the superhamiltonian and supermomentum), see, e.g. Refs. [1, 59] and MTW (Ref. 82), 
Chap. 21. The expression includes a Hamiltonian boundary term, a 2-surface integral 
of $\mathcal{B}^{GR}(Z) = -Z^\mu (1/2\kappa)\, \mathfrak{U}^{\nu\lambda}{}_\mu\, (1/2)\, d^2S_{\nu\lambda}$, i.e. it is entirely determined by the 
superpotential. The value of the Hamiltonian on a solution is entirely determined 
by this boundary term; the initial value constraints ensure that the Hamiltonian 
density in the spatial volume integral vanishes “on shell” (i.e. when the field equa- 
tions are satisfied). In a similar way the value given by any pseudotensor can be 
regarded as the value of the Hamiltonian with a certain boundary term.!? From 
the Hamiltonian variation, as we will discuss below, one gets important information 
that tames the ambiguity in the boundary term — namely boundary conditions — 
and thereby determines the physical significance of the various quasi-local values. 
The energy-momentum values obtained for the various pseudotensors can all be 
regarded as values of the Hamiltonian with different boundary conditions. 


5. The Quasi-Local View 


The modern idea, due to Penrose! in 1982, is that energy-momentum is quasi-local, 
i.e. it is associated with a closed 2-surface (while the pseudotensor energy-momentum 
complexes always had this property, its essential importance became 
much more appreciated after this work of Penrose, which introduced this convenient 
term). There is a comprehensive review of this topic: Szabados (2009).13! The 
many recent works cited in this review show that this is still a topic of considerable 
interest. In a brief summary one can find the statement: 

“... contrary to the high expectations of the 1980s, finding an appropriate 
quasi-local notion of energy-momentum has proven to be surprisingly 
difficult. Nowadays, the state of the art is typically postmodern: 
Although there are several promising and useful suggestions, we not 
only have no ultimate, generally accepted expression for the energy-momentum 
and especially for the angular momentum, but there is not 
even a consensus in the relativity community on general questions... or 
on the list of the criteria of reasonableness of such expressions.” 


However if one takes a more specific approach, one can come to a more satisfac- 
tory conclusion. In particular, the Hamiltonian view quite changes the prospects, 
especially when used along with a gauge perspective. 


6. Currents as Generators 


Noether’s work was entirely Lagrangian based. Her results can be taken a further 
step when they are combined with the Hamiltonian formulation. As we will see, the 
Hamiltonian formulation offers a handle on the Noether current ambiguity. 
One key feature can be seen already in Hamiltonian mechanics. A quantity Q 
conserved under the time evolution generated by a Hamiltonian H = H(q, p) is 
more than just a conserved quantity; it is also the canonical generator of a 
one-parameter transformation on phase space, (q(λ), p(λ)), which is a symmetry of the 
Hamiltonian: 

$$0 = \frac{dQ}{dt} = [Q, H] \iff \frac{dH}{d\lambda} = [H, Q] = 0. \tag{33}$$
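
The statement that a conserved quantity also generates a symmetry of the Hamiltonian can be illustrated numerically. The sketch below is our own toy example, not taken from the chapter: it assumes a particle in a plane with a rotationally symmetric potential, takes Q to be the angular momentum, and evaluates the Poisson bracket by central differences. The function names, the potential, and the sample phase-space point are all hypothetical choices.

```python
import math

def poisson_bracket(f, g, state, h=1e-5):
    """Canonical Poisson bracket {f, g} at state = (q1..qn, p1..pn),
    approximated with central finite differences."""
    n = len(state) // 2

    def partial(func, i):
        up, dn = list(state), list(state)
        up[i] += h
        dn[i] -= h
        return (func(up) - func(dn)) / (2 * h)

    total = 0.0
    for i in range(n):
        total += partial(f, i) * partial(g, n + i) - partial(f, n + i) * partial(g, i)
    return total

def hamiltonian(s):
    # Illustrative choice: kinetic term plus a rotationally symmetric potential.
    x, y, px, py = s
    r2 = x * x + y * y
    return 0.5 * (px * px + py * py) + 0.5 * r2 + 0.1 * r2 * r2

def angular_momentum(s):
    # Candidate conserved quantity Q.
    x, y, px, py = s
    return x * py - y * px

def rotate(s, lam):
    # Finite transformation generated by Q: a rotation of (x, y) and (px, py) by angle lam.
    x, y, px, py = s
    c, sn = math.cos(lam), math.sin(lam)
    return [c * x - sn * y, sn * x + c * y, c * px - sn * py, sn * px + c * py]

point = [0.3, -0.7, 1.1, 0.4]

# [Q, H] vanishes (to finite-difference accuracy): Q is conserved.
bracket_QH = poisson_bracket(angular_momentum, hamiltonian, point)

# H is unchanged along the flow generated by Q: Q generates a symmetry of H.
energy_shift = hamiltonian(rotate(point, 0.8)) - hamiltonian(point)

print("bracket [Q, H]:", bracket_QH)
print("change of H under the Q-flow:", energy_shift)
```

Both printed numbers should be zero up to rounding and discretization error, which is the content of the two equivalent statements in the relation above.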

In Hamiltonian field theory, the conserved currents are the generators of the 
associated symmetry. In particular, the generator of a local spacetime “transla- 
tion” (an infinitesimal diffeomorphism) is the Hamiltonian; energy-momentum is 
the associated conserved quantity. Conversely, for spacetime translations, the asso- 
ciated Noether conserved current expression (i.e. the energy-momentum density) 
is the Hamiltonian density — the canonical generator of spacetime displacements. 
As we will see, because it can be varied this translation generator gives a handle 
on the associated conserved current ambiguity. The Lagrangian formulation affords 
no such handle, because in terms of Lagrangian variables the translation current is 
not a generator that can be varied. 


7. Gauge and Geometry 


For the early history of gauge theory see O’Raifeartaigh.°° Briefly, the mile- 
stone works are Hermann Weyl’s treatments of electromagnetism: Weyl (1918),'4” 
Weyl (1929),!48 then the generalization to non-Abelian groups by Yang and Mills 
(1954)!>4 and Utiyama (1956, 1959).189-4° Explicitly treating gravity as a gauge 
theory was pioneered by Utiyama,!®*!4° using the Lorentz group and Riemannian 
geometry. Sciama!! also used the Lorentz group but with Riemann-Cartan geometry 
(i.e. nonvanishing torsion). Kibble®’ put things in their proper place: he gauged 
the Poincaré group (i.e. the inhomogeneous Lorentz group, including translations). 

For accounts of gravity as a spacetime symmetry gauge theory, see Hehl and 
coworkers,*°:52 54:56 Mielke®° and Blagojević.” A comprehensive reader with summaries, 
discussions, and many reprints has recently appeared: Blagojević and Hehl.® 
For the observational constraints on torsion see Ni (2010). 

To us it is rather surprising that the idea of regarding gravity as a gauge theory 
is not better known. Examined more closely, one finds that gravity played an 
important role in the argument used in both of the above mentioned seminal works 
of Weyl, and thus in all of the above, except for the Yang-Mills paper. Furthermore, 
later in 1974 Yang himself published a paper!°? where he proposed a certain 
treatment of gravity as a gauge theory.^m 


m The aforementioned reader includes a chapter with a critical discussion of Yang’s gauge theory 
of gravity. Recently Yang was asked about his 1974 paper; he said: “I do not believe that paper 
is correct.” (Ref. 153) 
According to our understanding, properly speaking, GR can be understood 
as the original gauge theory. After all, it was the first physical theory where 
local gauge freedom (in the guise of general coordinate invariance) played a key 
role.^n 

The conserved quantities, energy-momentum and angular momentum/CoMM 
are associated with the geometric symmetry of Minkowski spacetime, the spacetime 
translations and Lorentz rotations, i.e. the Poincaré group. Furthermore this group 
is used to classify physical particles according to mass and spin. So a local Poincaré 
gauge theory is quite appropriate both geometrically and physically. 

To give a good account, one should also be mindful of the parallel development 
of the closely related concept of a connection in differential geometry. Here, we just 
briefly mention that the main ideas were due to Hessenberg, Levi-Civita, Schouten, 
Weyl, Cartan, Ehresmann, and Koszul; for discussions of connections see Nomizu,®” 
Kobayashi and Nomizu” and Spivak.!?% 

As we will see in more detail, Riemann—Cartan (with a metric and a metric 
compatible connection, having both curvature and torsion) is the most appropriate 
geometry for a dynamic spacetime geometry theory: its local symmetries are just 
those of the local Poincaré group. So in this presentation, we will be considering the 
Poincaré gauge theories of gravity (PG); GR is included in this class as a special 
case. 


8. Dynamical Spacetime Geometry and the Hamiltonian 


We will consider geometric gravity theories with both a metric and an a priori met- 
ric compatible connection. Both curvature and torsion are allowed. The variational 
principles are developed. The Noether symmetries and the associated conserved 
quantities and differential identities are discussed. From a first-order Lagrangian 
formalism using differential forms, we construct a spacetime covariant Hamiltonian 
formalism. The Hamiltonian boundary term gives appropriate expressions for the 
quasi-local quantities, energy-momentum, angular momentum and CoMM, as well 
as quasi-local energy flux. The formalism easily specializes to teleparallel theory 
and Einstein’s GR. 

The Hamiltonian approach reveals certain aspects of a theory, including the 
constraints, gauges, and degrees of freedom, as well as expressions for 
energy-momentum and angular momentum. However, the usual ADM approach achieves 
this at a heavy cost: the loss of manifest 4D-covariance. Our alternative approach 
is complementary: a major benefit is manifestly 4D-covariant expressions for the 
quasi-local quantities: energy-momentum and angular momentum/CoMM. 


"It is true that the electrodynamics potentials along with their gauge freedom were known long 
before GR (in fact a Lagrangian which is locally gauge invariant had already been presented!8), 
but this gauge invariance was not seen as having any important role in connection with the nature 
of the interaction, the conservation of current, or a differential identity until the seminal work 
of Weyl, which post-dated (and was inspired by) GR. 
8.1. The main ideas 


The Hamiltonian for physical systems and dynamic spacetime geometry generates 
the evolution of a spatial region along a vector field. It includes a boundary term 
which determines the boundary conditions and supplies the value of the Hamilto- 
nian. The Hamiltonian value gives the quasi-local quantities: Energy-momentum 
and angular momentum/CoMM. A spacetime gauge theory perspective identifies 
suitable geometric variables. We found a certain preferred Hamiltonian bound- 
ary term. The Hamiltonian boundary term depends not only on the dynamical 
variables but also on their reference values; they determine the ground state — 
the state with vanishing quasi-local quantities. To determine the “best matched” 
reference metric and connection values for our preferred boundary term, we pro- 
pose on the boundary 2-surface: (i) 4D isometric matching and (ii) extremizing 
the energy. 


8.2. Some comments 


Before we begin our technical discussion of our work, let us make a few general 
comments. We work in 4D spacetime, but most of this can be extended to other 
dimensions in a straightforward fashion (except for the reference construction). 
The class of dynamical Lagrangians that we will consider does not allow for any 
derivatives of curvature or torsion. Our concerns are entirely classical. 

We focus here on Riemann—Cartan spacetime geometry (i.e. spacetimes with 
a metric and a metric compatible connection, having both curvature and torsion) 
and the PG; our general analysis can be specialized both to Riemannian geome- 
try (vanishing torsion) and teleparallel geometry (vanishing curvature); it includes 
GR and the teleparallel equivalent of GR as two special cases. Here, we assume a 
metric compatible connection; elsewhere we will present the generalization which 
includes nonmetricity. The extension to nonmetricity and the special case of telepar- 
allel geometry each offer further insight into gravitational energy; we believe those 
insights are best appreciated when compared to the results presented here for the 
Riemann—Cartan geometry with the PG. 


9. Differential Forms 


In this work, we mainly use differential forms.4°°!4? The reader may wonder why 
we use this less widely familiar idiom. The simple brief explanation is that they 
have some qualities that are technically very convenient for our needs. Differential 
forms are multiplied using the (graded Grassmann) wedge product. They can be 
differentiated using d, the exterior differential, a graded derivation which enjoys the 
property $d^2 = 0$, so a differential equation $d\alpha = \beta$ has the integrability condition 
$0 = d\beta$; furthermore $d\beta = 0 \Rightarrow \beta = d\alpha$, at least locally. The integrals of forms 
satisfy the general boundary theorem^o 

$$\int_U d\beta = \oint_{\partial U} \beta. \tag{34}$$
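
A simple special case of the boundary theorem can be checked numerically. The sketch below is our own illustration under stated assumptions: it takes a 1-form β = P dx + Q dy on the unit square in the plane (so the theorem reduces to Green's theorem), with P and Q chosen arbitrarily for the example, and compares the integral of dβ over the square with the integral of β around its boundary using the midpoint rule. All function names are hypothetical.

```python
def P(x, y):
    # dx-component of the illustrative 1-form beta = P dx + Q dy
    return -y ** 3

def Q(x, y):
    # dy-component of the illustrative 1-form
    return x ** 3 + x * y

def d_beta(x, y, h=1e-5):
    """Coefficient of dx^dy in d(beta): dQ/dx - dP/dy, by central differences."""
    dQdx = (Q(x + h, y) - Q(x - h, y)) / (2 * h)
    dPdy = (P(x, y + h) - P(x, y - h)) / (2 * h)
    return dQdx - dPdy

def surface_integral(n=200):
    """Midpoint-rule integral of d(beta) over the unit square U."""
    step = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += d_beta((i + 0.5) * step, (j + 0.5) * step) * step * step
    return total

def boundary_integral(n=200):
    """Midpoint-rule integral of beta around the boundary of U, counterclockwise."""
    step = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * step
        total += P(t, 0.0) * step   # bottom edge, x increasing
        total += Q(1.0, t) * step   # right edge, y increasing
        total -= P(t, 1.0) * step   # top edge, x decreasing
        total -= Q(0.0, t) * step   # left edge, y decreasing
    return total

lhs = surface_integral()
rhs = boundary_integral()
print("integral of d(beta) over U:      ", lhs)
print("integral of beta over boundary U:", rhs)
```

The two values agree up to the discretization error of the midpoint rule; for this particular choice of P and Q the exact value of both sides is 5/2.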


Also, they are well suited to representing interacting physical fields, especially gauge 
fields, and, as we shall see, they give a succinct representation of the main geometric 
objects: Connection, curvature, coframe, torsion, etc. Moreover, as will be explained 
below, they are quite convenient for the essential 3+1 spacetime decomposition of
derivatives that is needed for a dynamical Hamiltonian formulation.*4 

Regarding notation, here the contraction (or interior product) with a vector field
is denoted by $i_X\alpha(\cdot,\cdot,\ldots) := \alpha(X,\cdot,\ldots)$ (some authors use the notation of left
contraction: $X\rfloor$). The Lie derivative on forms is given by $\pounds_X = i_X d + d\,i_X$; it has the
nice property $d\pounds_X = \pounds_X d$. We are concerned here with the case of 4D spacetime,
which has a local Minkowski structure, having a metric with Lorentz signature.
The metric determines the unit volume 4-form $\eta$ with components $\eta_{\mu\nu\alpha\beta} = \eta_{[\mu\nu\alpha\beta]}$,
$\eta_{0123} = \sqrt{|g|}$, which is used to construct the Hodge dual that maps $k$-forms to
$(4-k)$-forms.
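
The commutation property of the Lie derivative with $d$ follows in two lines from the Cartan formula and $d^2 = 0$; a short check:

```latex
\begin{align*}
d\,\pounds_X &= d\,(i_X d + d\,i_X) = d\,i_X d + d^2 i_X = d\,i_X d, \\
\pounds_X\,d &= (i_X d + d\,i_X)\,d = i_X d^2 + d\,i_X d = d\,i_X d .
\end{align*}
```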

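From the coframe $\vartheta^\alpha$, one can construct a basis for $k$-forms $\vartheta^{\alpha\beta\cdots} := \vartheta^\alpha\wedge\vartheta^\beta\wedge\cdots$
and a useful dual basis $\eta^{\alpha\beta\cdots} := *\vartheta^{\alpha\beta\cdots}$. They are related by various identities,
especially

$$\vartheta^\alpha\wedge\eta_{\mu\nu\gamma} = \delta^\alpha_\gamma\,\eta_{\mu\nu} + \delta^\alpha_\nu\,\eta_{\gamma\mu} + \delta^\alpha_\mu\,\eta_{\nu\gamma}, \qquad (35)$$
$$\vartheta^{\alpha\beta}\wedge\eta_{\mu\nu\gamma} = \delta^{\alpha\beta}_{\mu\nu}\,\eta_\gamma + \delta^{\alpha\beta}_{\gamma\mu}\,\eta_\nu + \delta^{\alpha\beta}_{\nu\gamma}\,\eta_\mu, \qquad (36)$$
$$\vartheta^\alpha\wedge\eta_{\mu\nu} = \delta^\alpha_\nu\,\eta_\mu - \delta^\alpha_\mu\,\eta_\nu. \qquad (37)$$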
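
The two-index identity can be obtained from the other two. The sketch below takes the identities in their commonly used forms, $\vartheta^\alpha\wedge\eta_{\mu\nu\gamma} = \delta^\alpha_\gamma\eta_{\mu\nu} + \delta^\alpha_\nu\eta_{\gamma\mu} + \delta^\alpha_\mu\eta_{\nu\gamma}$ and $\vartheta^\alpha\wedge\eta_{\mu\nu} = \delta^\alpha_\nu\eta_\mu - \delta^\alpha_\mu\eta_\nu$, and assumes the normalization $\delta^{\alpha\beta}_{\mu\nu} := \delta^\alpha_\mu\delta^\beta_\nu - \delta^\alpha_\nu\delta^\beta_\mu$ (a convention chosen here, not stated in the text):

```latex
\begin{align*}
\vartheta^\alpha\wedge\vartheta^\beta\wedge\eta_{\mu\nu\gamma}
 &= \vartheta^\alpha\wedge\left(\delta^\beta_\gamma\,\eta_{\mu\nu}
    + \delta^\beta_\nu\,\eta_{\gamma\mu} + \delta^\beta_\mu\,\eta_{\nu\gamma}\right) \\
 &= \delta^\beta_\gamma\left(\delta^\alpha_\nu\eta_\mu - \delta^\alpha_\mu\eta_\nu\right)
  + \delta^\beta_\nu\left(\delta^\alpha_\mu\eta_\gamma - \delta^\alpha_\gamma\eta_\mu\right)
  + \delta^\beta_\mu\left(\delta^\alpha_\gamma\eta_\nu - \delta^\alpha_\nu\eta_\gamma\right) \\
 &= \delta^{\alpha\beta}_{\mu\nu}\,\eta_\gamma
  + \delta^{\alpha\beta}_{\gamma\mu}\,\eta_\nu
  + \delta^{\alpha\beta}_{\nu\gamma}\,\eta_\mu .
\end{align*}
```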


Maxwell’s electrodynamics is a good example of the utility of differential forms. 
Charge identifies the charge-current 3-form (density): 


$$Q(V) = \int_V J. \qquad (38)$$

Charge is conserved:

$$dJ = 0 \Rightarrow J = dH. \qquad (39)$$
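
To connect the form statement $dJ = 0$ with the familiar continuity equation, here is a sketch in Minkowski spacetime with a Cartesian coframe $\vartheta^\mu = dx^\mu$. The component expansion $J = j^\mu\eta_\mu$ and the relation $\vartheta^\nu\wedge\eta_\mu = \delta^\nu_\mu\,\eta$ are assumptions of this illustration:

```latex
\begin{align*}
dJ &= d\left(j^\mu\,\eta_\mu\right) = \partial_\nu j^\mu\, dx^\nu\wedge\eta_\mu
    = \left(\partial_\mu j^\mu\right)\eta
    \qquad (d\eta_\mu = 0 \text{ for a Cartesian coframe}), \\
J &= dH \quad\Longrightarrow\quad dJ = d^2 H = 0 .
\end{align*}
```

So $dJ = 0$ is the statement $\partial_\mu j^\mu = 0$, and any $J$ derived from a potential 2-form $H$ is conserved automatically.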


^o This generalization of the fundamental theorem of calculus is often referred to as the generalized
Stokes theorem. Special cases include the Ostrogradsky-Gauss and Stokes theorems of vector
analysis.

The electromagnetic field is represented by a 2-form $F$. An elementary way to
see why this is appropriate is to examine the motion of a point test charge. One
should begin with kinematics in Minkowski space. Consider the motion of a point
particle as a function of proper time: $x^\mu = x^\mu(\tau)$. The 4-velocity $v^\mu := dx^\mu/d\tau$ has
constant magnitude: $v^\mu v_\mu = -c^2$, so the 4-acceleration is Lorentz orthogonal to the
4-velocity. Hence, the 4-force must be orthogonal to the 4-velocity. Consequently
the 4-force must depend on the velocity. The simplest case is for a 4-force linear in
the 4-velocity. Thus the simplest dynamical law has the form

$$\frac{dp_\mu}{d\tau} = q F_{\mu\nu} v^\nu, \qquad (40)$$

where $p_\mu := m v_\mu$ is the 4-momentum, $q$ is a coupling constant and $F_{\mu\nu}$ is some
tensor field which is antisymmetric, i.e. it is a 2-form. The Lorentz force law of
electrodynamics has this form. The force law identifies a certain field strength
2-form $F$ which includes the electric and magnetic fields. Conservation of magnetic
flux through a closed 2-surface $S = \partial V$ gives

$$0 = \oint_S F = \int_V dF, \qquad dF = 0 \Leftrightarrow F = dA, \qquad (41)$$

and there are local gauge transformations: $A \to A + d\chi$. The vacuum constitutive
relation is $H = *F/Z_0$ ($Z_0$ is the vacuum impedance). This covariant formulation
of Maxwell’s electrodynamics is valid for all dynamic geometry gravity theories and
does not depend upon using a particular set of units; for a detailed, comprehensive
and instructive presentation see Hehl and Obukhov.®°
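
The kinematic argument above can be written out in a few lines. This is only a sketch: it assumes constant rest mass $m$ and a constant Minkowski metric, and it spells out why the force tensor must be antisymmetric and why the gauge shift of the potential leaves $F$ unchanged:

```latex
\begin{align*}
0 &= \frac{d}{d\tau}\left(v^\mu v_\mu\right) = 2\,v^\mu\frac{dv_\mu}{d\tau}
   \quad\Longrightarrow\quad v^\mu\frac{dp_\mu}{d\tau} = 0, \\
0 &= v^\mu\frac{dp_\mu}{d\tau} = q\,F_{\mu\nu}\,v^\mu v^\nu = q\,F_{(\mu\nu)}\,v^\mu v^\nu
   \quad\text{for all timelike } v
   \quad\Longrightarrow\quad F_{(\mu\nu)} = 0, \\
A &\to A + d\chi: \qquad F = dA \to d(A + d\chi) = dA + d^2\chi = F .
\end{align*}
```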


10. Variational Principle for Form Fields 


Why do we use variational principles?’® The answer is pragmatics: Because they 
work. With appropriate symmetries, they give consistent interacting field equations 
along with conserved Noether currents for all the desired quantities. As far as 
we know, all the known good dynamical evolution equations for the fundamental 
interacting classical field theories have a variational formulation. 

In the usual formulations, most dynamical fields satisfy second-order equations. 
We refer to such formulations as second-order. 

Let $\varphi^A$ be some kind of vector field. The label “A” stands for some collection
of indices, e.g. spinor, spacetime, isospin. Allow $\varphi^A$ to also be a differential form of
rank $f$ where $f = 0, 1, 2$, or $3$, e.g. $\varphi^A = \frac12\varphi^A{}_{\mu\nu}\,\vartheta^\mu\wedge\vartheta^\nu = \frac12\varphi^A{}_{ij}\,dx^i\wedge dx^j$ for $f = 2$.

The Lagrangian density is a 4-form: 


$$\mathcal{L} = \mathcal{L}(\varphi^A, d\varphi^A). \qquad (42)$$


Note that there is no explicit appearance of the coordinates $x^i$ or the coordinate
partials $\partial_i$; $d\varphi^A$ is an $(f+1)$-form which geometrically includes partial derivatives
of the components of $\varphi^A$, but only in an antisymmetric combination. (Here, we
explicitly consider just one $f$-form field. The generalization to include several fields
of different grades is straightforward.)

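Our convention is to vary fields off to the left (other conventions would differ
only by some signs). The variation of $\mathcal{L}$ is thus

$$\delta\mathcal{L} = \delta d\varphi^A\wedge\frac{\partial\mathcal{L}}{\partial d\varphi^A} + \delta\varphi^A\wedge\frac{\partial\mathcal{L}}{\partial\varphi^A}. \qquad (43)$$

This implicitly defines $\partial\mathcal{L}/\partial d\varphi^A$ as a $(3-f)$-form and $\partial\mathcal{L}/\partial\varphi^A$ as a $(4-f)$-form.
Next, interchange the order (i.e. $\delta d = d\delta$) to get

$$\delta\mathcal{L} = d\left(\delta\varphi^A\wedge\frac{\partial\mathcal{L}}{\partial d\varphi^A}\right) + \delta\varphi^A\wedge\left(\frac{\partial\mathcal{L}}{\partial\varphi^A} - \varsigma\,d\frac{\partial\mathcal{L}}{\partial d\varphi^A}\right). \qquad (44)$$

(Upon integration over some spacetime region this last step is “integration by
parts,” with the total differential term becoming a boundary term.) From the above,
it follows that the basic variational relation

$$\delta\mathcal{L} = d(\delta\varphi^A\wedge p_A) + \delta\varphi^A\wedge\frac{\delta\mathcal{L}}{\delta\varphi^A} \qquad (45)$$

can be regarded as implicitly defining the conjugate field momentum and the Euler-Lagrange
variational derivative, which have the respective explicit definitions

$$p_A := \frac{\partial\mathcal{L}}{\partial d\varphi^A}, \qquad (46)$$
$$\frac{\delta\mathcal{L}}{\delta\varphi^A} := \frac{\partial\mathcal{L}}{\partial\varphi^A} - \varsigma\,d\frac{\partial\mathcal{L}}{\partial d\varphi^A}. \qquad (47)$$

A small price for using form fields is the appearance of occasional sign factors like
$\varsigma := (-1)^f$.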
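
The sign factor comes from the graded Leibniz rule for $d$ acting on the product of an $f$-form with another form. A short sketch of the integration-by-parts step, writing $P_A$ as a shorthand (introduced here) for $\partial\mathcal{L}/\partial d\varphi^A$:

```latex
\begin{align*}
d\left(\delta\varphi^A\wedge P_A\right)
   &= d\delta\varphi^A\wedge P_A + (-1)^f\,\delta\varphi^A\wedge dP_A, \\
d\delta\varphi^A\wedge P_A
   &= d\left(\delta\varphi^A\wedge P_A\right) - \varsigma\,\delta\varphi^A\wedge dP_A,
   \qquad \varsigma = (-1)^f .
\end{align*}
```

Thus $\varsigma = +1$ for 0-forms and 2-forms and $\varsigma = -1$ for 1-forms and 3-forms.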


10.1. Hamilton’s principle 


Our first application of (45) is Hamilton’s principle (the principle of least action). 
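Let the action within a region $U$ be given by $S := \int_U \mathcal{L}$. Then

$$\delta S = \int_U \delta\varphi^A\wedge\frac{\delta\mathcal{L}}{\delta\varphi^A} + \oint_{\partial U}\delta\varphi^A\wedge p_A. \qquad (48)$$

Now require the action $S$ to be extreme (i.e. $\delta S = 0$) for $\delta\varphi^A$ vanishing on the
boundary of $U$. This yields the field equation $\delta\mathcal{L}/\delta\varphi^A = 0$.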
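
As a minimal worked example, consider a single real scalar (0-form, so $\varsigma = 1$) with the Lagrangian 4-form below. The Lagrangian, its normalization and the mass parameter $m$ are assumptions chosen for illustration; the text does not specify any particular matter model at this point.

```latex
\begin{align*}
\mathcal{L} &= -\tfrac12\,d\varphi\wedge{*d\varphi} - \tfrac12\,m^2\varphi^2\,\eta, \\
\delta\mathcal{L} &= -\,\delta d\varphi\wedge{*d\varphi} - \delta\varphi\; m^2\varphi\,\eta
   \qquad (\text{using } \alpha\wedge{*\beta} = \beta\wedge{*\alpha}), \\
p &= \frac{\partial\mathcal{L}}{\partial d\varphi} = -{*d\varphi}, \qquad
\frac{\partial\mathcal{L}}{\partial\varphi} = -m^2\varphi\,\eta, \\
\frac{\delta\mathcal{L}}{\delta\varphi} &= \frac{\partial\mathcal{L}}{\partial\varphi} - d\,p
   = d\,{*d\varphi} - m^2\varphi\,\eta = 0 .
\end{align*}
```

This is the Klein-Gordon equation written as a 4-form equation.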


10.2. Compact representation 


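For a compact general discussion, it is convenient to suppress the field component
index.®? (This could be represented in matrix notation; our basic fields $\varphi$ and their
differential could be regarded as row vectors.) The Lagrangian then has the form
$\mathcal{L} = \mathcal{L}(\varphi, d\varphi)$ and the variational scheme proceeds as

$$\begin{aligned}
\delta\mathcal{L} &= d\delta\varphi\wedge\frac{\partial\mathcal{L}}{\partial d\varphi} + \delta\varphi\wedge\frac{\partial\mathcal{L}}{\partial\varphi} \\
&= d\left(\delta\varphi\wedge\frac{\partial\mathcal{L}}{\partial d\varphi}\right) + \delta\varphi\wedge\left(\frac{\partial\mathcal{L}}{\partial\varphi} - \varsigma\,d\frac{\partial\mathcal{L}}{\partial d\varphi}\right),
\end{aligned} \qquad (49)$$

hence the key Lagrangian variational identity takes the form

$$\delta\mathcal{L} = d(\delta\varphi\wedge p) + \delta\varphi\wedge\frac{\delta\mathcal{L}}{\delta\varphi}. \qquad (50)$$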


In this succinct alternative to (45), the conjugate momentum and the variational 
derivative can be regarded as form-valued column vector fields. 


11. Some Simple Examples of the Noether Theorems 


Here, we present simple examples of Noether’s two theorems.?! Later, we shall use 
the same types of arguments in more complicated situations. 


11.1. Noether’s first theorem: Energy—momentum 


Further applications of the basic variational identity (45) or (50) yield the Noether 
theorems. Their applications to physical systems are discussed in many works, e.g. 
Konopleva and Popov.” Here, we introduce our particular use of them using two 
specific important cases. 

Noether’s first theorem states that for a constant parameter symmetry, there is 
a conserved current. 

For our concerns the most important example is the conservation of energy— 
momentum. As our specific relevant simple case exemplifying the argument, we 
specialize in this subsection to Minkowski spacetime, which is homogeneous and 
thus naturally has a geometric symmetry under translations. Dynamically let us 
assume symmetry also of the action under constant translations. The symmetry 
depends on a continuous parameter; it is sufficient to consider the infinitesimal 
case. Geometrically an infinitesimal translation corresponds to a constant vector 
field Z. Under such a transformation, the change in the components of form fields 
is given by the Lie derivative. We have: 


$$\Delta\varphi = -\pounds_Z\varphi = -(d\,i_Z + i_Z d)\varphi, \qquad (51)$$
$$\Delta\mathcal{L} = -\pounds_Z\mathcal{L} = -d\,i_Z\mathcal{L}. \qquad (52)$$

Equation (50) under these specific variations (i.e. replacing the general $\delta$ by these
specific changes) should be an identity (since the Lagrangian $\mathcal{L}$ depends on the
position only through the fields $\varphi$). Rearranging leads to

$$\Delta\mathcal{L} - d(\Delta\varphi\wedge p) = \Delta\varphi\wedge\frac{\delta\mathcal{L}}{\delta\varphi}. \qquad (53)$$

Because of (52) the l.h.s. of Eq. (53) is a total differential, one
which moreover vanishes if the Euler-Lagrange field equations are imposed:

$$d(-i_Z\mathcal{L} + \pounds_Z\varphi\wedge p) = -\pounds_Z\varphi\wedge\frac{\delta\mathcal{L}}{\delta\varphi}. \qquad (54)$$

This identifies a conserved current density (a 3-form with vanishing differential on
shell, i.e. when the field equations are satisfied) called the generalized canonical
stress energy-momentum density (3-form):

$$\mathcal{T}(Z) = i_Z\mathcal{L} - \pounds_Z\varphi\wedge p. \qquad (55)$$

For a zero-form field, it takes the special shape $T^\mu{}_\nu Z^\nu\eta_\mu$ where (with $L = *\mathcal{L}$)

$$T^\mu{}_\nu = \delta^\mu_\nu L - \frac{\partial L}{\partial\,\partial_\mu\varphi}\,\partial_\nu\varphi, \qquad (56)$$

which is the well-known expression mentioned earlier (2).
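
The zero-form case can be checked directly. The sketch below assumes that the scalar density $L$ is defined by $\mathcal{L} = L\,\eta$ (which agrees with $L = *\mathcal{L}$ up to a signature-dependent sign of the Hodge dual), that $\eta_\mu := i_{e_\mu}\eta$, and that $\vartheta^\nu\wedge\eta_\mu = \delta^\nu_\mu\,\eta$; these conventions are assumptions of the illustration.

```latex
\begin{align*}
d\varphi &= \partial_\mu\varphi\,\vartheta^\mu, \qquad
p = \frac{\partial\mathcal{L}}{\partial d\varphi}
  = \frac{\partial L}{\partial\,\partial_\mu\varphi}\,\eta_\mu
  \quad\left(\text{since } \delta d\varphi\wedge p
  = \delta(\partial_\mu\varphi)\,\frac{\partial L}{\partial\,\partial_\mu\varphi}\,\eta\right), \\
i_Z\mathcal{L} &= L\,Z^\mu\eta_\mu, \qquad
\pounds_Z\varphi = i_Z d\varphi = Z^\nu\partial_\nu\varphi, \\
\mathcal{T}(Z) &= i_Z\mathcal{L} - \pounds_Z\varphi\wedge p
  = Z^\nu\left(\delta^\mu_\nu L
  - \frac{\partial L}{\partial\,\partial_\mu\varphi}\,\partial_\nu\varphi\right)\eta_\mu .
\end{align*}
```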


11.2. Noether’s second theorem: Gauge fields 


Note: For the rest of the present section, there is no need to restrict the spacetime 
geometry in any way. Our considerations apply quite generally. 
Now, we want to consider invariance under local gauge transformations:

$$\Delta\varphi = \alpha^p\varphi T_p, \qquad (57)$$

where the $\alpha^p$ are position dependent parameters and the $T_p$ are the matrix generators
(in our representation the fields are on the left and the matrix on the right)
of the gauge group. Replace $d\varphi$ by the gauge covariant differential:

$$D\varphi := d\varphi + A^p\wedge\varphi T_p, \qquad (58)$$

containing a certain compensating field, the gauge vector potential (a.k.a. the gauge
connection one-form): $A^p = A^p{}_i\,dx^i$, which has a special nonhomogeneous gauge
transformation:

$$\Delta A^p = -D\alpha^p = -(d\alpha^p + A^q C^p{}_{qr}\alpha^r), \qquad (59)$$

where $C^p{}_{qr}$ are the gauge group structure constants: $[T_q, T_r] = C^p{}_{qr}T_p$. Then

$$\begin{aligned}
\Delta(D\varphi) &:= (\Delta D)\varphi + D\Delta\varphi \\
&= -(D\alpha^p)\wedge\varphi T_p + D(\alpha^p\varphi T_p) \\
&= \alpha^p (D\varphi) T_p.
\end{aligned} \qquad (60)$$

Thus $D\varphi$ transforms just like $\varphi$.
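
The simplest instance is the Abelian case. The sketch below assumes a single generator $T = i$ acting on a complex scalar 0-form, so that the structure constants vanish; this choice of representation is an illustration, not something fixed by the text.

```latex
\begin{align*}
\Delta\varphi &= i\alpha\,\varphi, \qquad \Delta A = -d\alpha, \qquad
D\varphi = d\varphi + iA\,\varphi, \\
\Delta(D\varphi) &= d(i\alpha\varphi) + i(\Delta A)\,\varphi + iA\,(i\alpha\varphi) \\
 &= i\,d\alpha\,\varphi + i\alpha\,d\varphi - i\,d\alpha\,\varphi + i\alpha\,(iA\,\varphi)
  = i\alpha\,D\varphi .
\end{align*}
```

The inhomogeneous piece $-d\alpha$ in the potential is exactly what cancels the derivative of the position dependent parameter.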
Rather than starting with a Lagrangian 4-form of the type $\mathcal{L} = \mathcal{L}(\varphi, d\varphi, A^p, dA^p)$
and then discovering that the variables $A^p, d\varphi, dA^p$ can only appear in
nice covariant combinations, let us proceed more covariantly, beginning with the
Lagrangian 4-form

$$\mathcal{L} = \mathcal{L}(\varphi, D\varphi, A^p, F^p), \qquad (61)$$

where $F^p$ is the field strength or gauge curvature 2-form:

$$F^p := dA^p + \frac12 C^p{}_{qr}A^q\wedge A^r. \qquad (62)$$

Since $A^p$ is still allowed to appear independently in (61), there is no loss of
generality.
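The variation of $\mathcal{L}$ (61) is

$$\delta\mathcal{L} = d\left(\delta\varphi\wedge\frac{\partial\mathcal{L}}{\partial D\varphi} + \delta A^p\wedge\frac{\partial\mathcal{L}}{\partial F^p}\right) + \delta\varphi\wedge\frac{\delta\mathcal{L}}{\delta\varphi} + \delta A^p\wedge\frac{\delta\mathcal{L}}{\delta A^p}. \qquad (63)$$

Now we are set for an example of Noether’s second theorem: for each local
invariance, there is a differential identity.

Assume that $\mathcal{L}$ (61) is invariant under the special changes $\Delta\varphi$, $\Delta A^p$ of Eqs. (57)
and (59). From (63) we then have the identity

$$0 = d\left(\alpha^p\varphi T_p\wedge\frac{\partial\mathcal{L}}{\partial D\varphi} - D\alpha^p\wedge\frac{\partial\mathcal{L}}{\partial F^p}\right) + \alpha^p\varphi T_p\wedge\frac{\delta\mathcal{L}}{\delta\varphi} - D\alpha^p\wedge\frac{\delta\mathcal{L}}{\delta A^p}. \qquad (64)$$

The second term in the parenthesis may be rewritten as $-d(\alpha^p\,\partial\mathcal{L}/\partial F^p) + \alpha^p D(\partial\mathcal{L}/\partial F^p)$;
then using $d^2 = 0$ gives

$$0 = d\left[\alpha^p\left(\varphi T_p\wedge\frac{\partial\mathcal{L}}{\partial D\varphi} + D\frac{\partial\mathcal{L}}{\partial F^p}\right)\right] + \alpha^p\varphi T_p\wedge\frac{\delta\mathcal{L}}{\delta\varphi} - D\alpha^p\wedge\frac{\delta\mathcal{L}}{\delta A^p} \qquad (65)$$
$$\phantom{0} = D\alpha^p\wedge\left(\varphi T_p\wedge\frac{\partial\mathcal{L}}{\partial D\varphi} + D\frac{\partial\mathcal{L}}{\partial F^p} - \frac{\delta\mathcal{L}}{\delta A^p}\right) + \alpha^p\left[D\left(\varphi T_p\wedge\frac{\partial\mathcal{L}}{\partial D\varphi} + D\frac{\partial\mathcal{L}}{\partial F^p}\right) + \varphi T_p\wedge\frac{\delta\mathcal{L}}{\delta\varphi}\right]. \qquad (66)$$

For a local symmetry the quantities $\alpha^p$ and $D\alpha^p$ are pointwise independent;
their coefficients must vanish separately. The coefficient of $\alpha^p$ identifies a Noether I
type conserved current:

$$J_p := \varphi T_p\wedge\frac{\partial\mathcal{L}}{\partial D\varphi} + D\frac{\partial\mathcal{L}}{\partial F^p}, \qquad (67)$$

which satisfies the “conservation” law

$$DJ_p = -\varphi T_p\wedge\frac{\delta\mathcal{L}}{\delta\varphi}. \qquad (68)$$

The r.h.s. vanishes “on shell” (i.e. when the field equations are satisfied).
From the coefficient of $D\alpha^p$, we obtain an algebraic identity relating the
Noether I current to a variational derivative:

$$J_p = \frac{\delta\mathcal{L}}{\delta A^p}, \qquad (69)$$

thereby the Noether I current conservation becomes a differential identity

$$D\frac{\delta\mathcal{L}}{\delta A^p} = -\varphi T_p\wedge\frac{\delta\mathcal{L}}{\delta\varphi} \qquad (70)$$

between the variational derivatives. Note that to obtain these results, there is no
need for the explicit form of the field equations.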

Another way to argue is to replace the last term in (65) with a total differential 
minus a compensating term, bringing that relation into the form 


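$$0 = d\left[\alpha^p\left(\varphi T_p\wedge\frac{\partial\mathcal{L}}{\partial D\varphi} + D\frac{\partial\mathcal{L}}{\partial F^p} - \frac{\delta\mathcal{L}}{\delta A^p}\right)\right] + \alpha^p\left(\varphi T_p\wedge\frac{\delta\mathcal{L}}{\delta\varphi} + D\frac{\delta\mathcal{L}}{\delta A^p}\right). \qquad (71)$$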


If one integrates this over any region, the total differential term gives rise to an 
integral over the boundary. To have a vanishing value for all possible gauge param- 
eters with small support, the coefficient of the gauge parameter everywhere within 
the region and the coefficient of the gauge parameter everywhere on the boundary 


must both vanish identically. This again yields (70) and 
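$$\varphi T_p\wedge\frac{\partial\mathcal{L}}{\partial D\varphi} + D\frac{\partial\mathcal{L}}{\partial F^p} - \frac{\delta\mathcal{L}}{\delta A^p} = 0, \qquad (72)$$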


which is equivalent to (68) with (69). 


11.3. Field equations with local gauge theory 


It should be noted that the Noether invariance argument yields the differential 
identities just found involving the Euler-Lagrange expressions without any need to 
have the explicit form of the Euler-Lagrange expressions. Of course if one explicitly 
computes the Euler-Lagrange expressions, one could go on to verify these identities 
directly. Furthermore, if one has the Euler-Lagrange expressions, one could (prob- 
ably not so easily) directly discover such identities, even if one was not aware of 
the local symmetry. 
To compute the field equations, the explicit variations 


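$$\delta D\varphi = D\delta\varphi + \delta A^p\wedge\varphi T_p, \qquad (73)$$
$$\delta F^p = d\delta A^p + C^p{}_{qr}A^q\wedge\delta A^r = D\delta A^p, \qquad (74)$$

are needed. The variation of the Lagrangian 4-form $\mathcal{L}$ (63) is

$$\begin{aligned}
\delta\mathcal{L} &= \delta D\varphi\wedge\frac{\partial\mathcal{L}}{\partial D\varphi} + \delta\varphi\wedge\frac{\partial\mathcal{L}}{\partial\varphi} + \delta F^p\wedge\frac{\partial\mathcal{L}}{\partial F^p} + \delta A^p\wedge\frac{\partial\mathcal{L}}{\partial A^p} \\
&= (D\delta\varphi + \delta A^p\wedge\varphi T_p)\wedge\frac{\partial\mathcal{L}}{\partial D\varphi} + \delta\varphi\wedge\frac{\partial\mathcal{L}}{\partial\varphi} + D\delta A^p\wedge\frac{\partial\mathcal{L}}{\partial F^p} + \delta A^p\wedge\frac{\partial\mathcal{L}}{\partial A^p} \\
&= d\left(\delta\varphi\wedge\frac{\partial\mathcal{L}}{\partial D\varphi} + \delta A^p\wedge\frac{\partial\mathcal{L}}{\partial F^p}\right) + \delta\varphi\wedge\left(\frac{\partial\mathcal{L}}{\partial\varphi} - \varsigma\,D\frac{\partial\mathcal{L}}{\partial D\varphi}\right) \\
&\quad + \delta A^p\wedge\left(\frac{\partial\mathcal{L}}{\partial A^p} + D\frac{\partial\mathcal{L}}{\partial F^p} + \varphi T_p\wedge\frac{\partial\mathcal{L}}{\partial D\varphi}\right).
\end{aligned} \qquad (75)$$

Comparing the explicit form of $\delta\mathcal{L}/\delta A^p$ found here with (69) and (67) shows
that (69) means

$$\frac{\partial\mathcal{L}}{\partial A^p} = 0. \qquad (76)$$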


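Thus, local gauge invariance means: no explicit dependence on the gauge potential;
all dependence on $A^p$ comes through $D\varphi$ and $F^p$. Furthermore, if one makes the
usual minimal coupling assumption,

$$\mathcal{L} = \mathcal{L}_A(A^p, F^p) + \mathcal{L}_\varphi(\varphi, D\varphi, A^p), \qquad (77)$$

the identities (76) and (70) apply separately to each term. Hence, the Lagrangian
4-form must have the simpler form

$$\mathcal{L} = \mathcal{L}_A(F^p) + \mathcal{L}_\varphi(\varphi, D\varphi), \qquad (78)$$

and the differential identity (70) becomes the two identities

$$D\frac{\delta\mathcal{L}_A}{\delta A^p} = 0, \qquad D\frac{\delta\mathcal{L}_\varphi}{\delta A^p} = -\varphi T_p\wedge\frac{\delta\mathcal{L}_\varphi}{\delta\varphi}. \qquad (79)$$

The first relation is explicitly

$$0 = D^2\frac{\partial\mathcal{L}_A}{\partial F^p} = -F^q C^r{}_{qp}\wedge\frac{\partial\mathcal{L}_A}{\partial F^r}. \qquad (80)$$

The latter is a kind of gauge current “conservation,” as the r.h.s. vanishes since

$$\frac{\delta\mathcal{L}_\varphi}{\delta\varphi} = \frac{\delta\mathcal{L}}{\delta\varphi} = 0 \qquad (81)$$

on shell. In more detail, this gauge current “conservation” relation has the form

$$0 = DJ_p = dJ_p - A^q C^r{}_{qp}\wedge J_r. \qquad (82)$$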


Thus, it has some similarities to the vanishing covariant differential of the material 
energy-momentum. Just as in that case, one can rearrange the field equation to 
obtain a conserved gauge pseudocurrent. 
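
The rearrangement mentioned in the last sentence can be sketched as follows. The names $X_p$, $J^{\varphi}_p$ and $\mathfrak{J}_p$ are introduced here purely for illustration; the sketch assumes a Lagrangian of the minimally coupled form $\mathcal{L}_A(F^p) + \mathcal{L}_\varphi(\varphi, D\varphi)$ with no explicit dependence on $A^p$, and the covariant differential of a lower-index form $DW_p = dW_p - A^q C^r{}_{qp}\wedge W_r$.

```latex
\begin{align*}
X_p &:= \frac{\partial\mathcal{L}_A}{\partial F^p}, \qquad
J^{\varphi}_p := \varphi T_p\wedge\frac{\partial\mathcal{L}_\varphi}{\partial D\varphi}, \\
0 &= \frac{\delta\mathcal{L}}{\delta A^p} = DX_p + J^{\varphi}_p
   = dX_p - A^q C^r{}_{qp}\wedge X_r + J^{\varphi}_p, \\
\mathfrak{J}_p &:= J^{\varphi}_p - A^q C^r{}_{qp}\wedge X_r
   \quad\Longrightarrow\quad dX_p = -\mathfrak{J}_p, \qquad
   d\mathfrak{J}_p = -d^2 X_p = 0 .
\end{align*}
```

The price is that $\mathfrak{J}_p$ contains the bare potential $A^q$ and so is gauge dependent, which is why it is only a pseudocurrent.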

We have gone into considerable detail in this relatively simple example. We have 
done this to prepare the reader, because we are going to use a very similar argumen- 
tation in connection with the rather more complicated case involving gravity and 
the dynamic spacetime geometric symmetries associated with energy-momentum 
and angular momentum. It will be seen that almost every step used in our later 
argument and every expression has an analogue with what we have done in this 
subsection.


12. First-Order Formulation


In this section, we discuss the general formulation of the first-order formalism; the 
spacetime geometry has no restrictions. 

We proceed from the action principle. Any action principle can be rewritten 
in an equivalent form, which (following, e.g. ADM! and Kuchař”°) we refer to as
first-order; this is the most convenient form for our purposes. Here, we present 
a simple argument (essentially the same Legendre transform idea as is used in 
classical mechanics to construct the Hamiltonian) which is applicable to a large 
class of second-order Lagrangians. 

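Given a second-order Lagrangian 4-form $\mathcal{L}(\varphi, d\varphi)$, we define its associated
canonical momentum in the usual way:

$$p := \frac{\partial\mathcal{L}}{\partial d\varphi}. \qquad (83)$$

Next, we define a 4-form by

$$\Lambda(\varphi, d\varphi, p) := d\varphi\wedge p - \mathcal{L}. \qquad (84)$$

Now consider the variation of $\Lambda$:

$$\begin{aligned}
\delta\Lambda &= \delta(d\varphi)\wedge p + d\varphi\wedge\delta p - \delta\mathcal{L} \\
&= \delta(d\varphi)\wedge\left(p - \frac{\partial\mathcal{L}}{\partial d\varphi}\right) + d\varphi\wedge\delta p - \delta\varphi\wedge\frac{\partial\mathcal{L}}{\partial\varphi} \\
&= d\varphi\wedge\delta p - \delta\varphi\wedge\frac{\partial\mathcal{L}}{\partial\varphi},
\end{aligned} \qquad (85)$$

which shows that $\Lambda$ can be regarded as a function only of $\varphi$, $p$.^p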
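This construction takes one from the usual second-order Lagrangian to a first-order
type of variational principle:

$$\mathcal{L}^{1\mathrm{st}}(\varphi, d\varphi, p) = d\varphi\wedge p - \Lambda(\varphi, p), \qquad (86)$$

where $\varphi$ and $p$ are now regarded as being independent variables and are varied
independently. Varying (86) gives

$$\begin{aligned}
\delta\mathcal{L}^{1\mathrm{st}} &= \delta d\varphi\wedge p + d\varphi\wedge\delta p - \delta\Lambda \\
&= d(\delta\varphi\wedge p) - \varsigma\,\delta\varphi\wedge dp + d\varphi\wedge\delta p - \delta\varphi\wedge\frac{\partial\Lambda}{\partial\varphi} - \frac{\partial\Lambda}{\partial p}\wedge\delta p, \\
\delta\mathcal{L}^{1\mathrm{st}} &= d(\delta\varphi\wedge p) + \delta\varphi\wedge\frac{\delta\mathcal{L}^{1\mathrm{st}}}{\delta\varphi} + \frac{\delta\mathcal{L}^{1\mathrm{st}}}{\delta p}\wedge\delta p.
\end{aligned} \qquad (87)$$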


^p The procedure becomes technically somewhat more complicated if (83) cannot be inverted for $d\varphi$
in terms of $\varphi$, $p$. In that case, one must introduce some additional variables that appear in $\Lambda$ only
algebraically and thus function as Lagrange multipliers introducing some algebraic constraints.
We will not go into such complications in our general development here. Later in our treatment of
Einstein’s GR, we will see a concrete example. Examples of how this has been dealt with in field
theory in practice can be found in Refs. 46, 128 and 129.

(Note: We find it more convenient to vary our momentum fields $p$ off to the right.
This reduces a little the number of appearances of the sign factor $\varsigma$, and merely
amounts to a sign convention on the definition of $\partial\Lambda/\partial p$.)
Using independent $p$ and $\varphi$ variations gives a pair of first-order field equations
for the differentials of the fields:

$$\frac{\delta\mathcal{L}^{1\mathrm{st}}}{\delta\varphi} := -\varsigma\,dp - \frac{\partial\Lambda}{\partial\varphi}, \qquad \frac{\delta\mathcal{L}^{1\mathrm{st}}}{\delta p} := d\varphi - \frac{\partial\Lambda}{\partial p}. \qquad (88)$$
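
A sketch of how the Legendre construction works for a concrete model may help. The massive scalar Lagrangian used below is an assumed illustration, as is the use of $** = +1$ on 1-forms and 3-forms (valid for a Lorentz-signature metric in 4D); neither is taken from the text.

```latex
\begin{align*}
\mathcal{L} &= -\tfrac12\,d\varphi\wedge{*d\varphi} - \tfrac12\,m^2\varphi^2\,\eta,
 \qquad p = -{*d\varphi} \;\Longleftrightarrow\; d\varphi = -{*p}, \\
\Lambda(\varphi, p) &= d\varphi\wedge p - \mathcal{L}
   = -\tfrac12\,{*p}\wedge p + \tfrac12\,m^2\varphi^2\,\eta, \\
\frac{\partial\Lambda}{\partial p} &= -{*p}, \qquad
\frac{\partial\Lambda}{\partial\varphi} = m^2\varphi\,\eta, \\
\frac{\delta\mathcal{L}^{1\mathrm{st}}}{\delta p} &= d\varphi + {*p} = 0, \qquad
\frac{\delta\mathcal{L}^{1\mathrm{st}}}{\delta\varphi} = -dp - m^2\varphi\,\eta = 0 .
\end{align*}
```

Eliminating $p$ with the first equation turns the second into $d\,{*d\varphi} - m^2\varphi\,\eta = 0$, the second-order Klein-Gordon equation, so the pair of first-order equations carries the same content.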


13. The Hamiltonian and the 3+ 1 Spacetime Split 


Here, we introduce the Hamiltonian and the spacetime split. In this introduc- 
tory subsection, we use for motivation some well-known elementary expressions in 
Minkowski spacetime. In the subsequent subsections, the spacetime geometry is quite 
general. 

A key feature of the canonical Hamiltonian formulation is that the field equations
are decomposed into two sets: The initial value constraint equations and
the dynamic equations. A familiar example which illustrates many of the ideas is
Maxwell’s vacuum electrodynamics.^q The 4-covariant equations were given earlier
(39) and (41): $d{*F} = Z_0 J$, $dF = 0$, or in tensor index form

$$\partial_\mu(\sqrt{-g}\,F^{\nu\mu}) = Z_0\sqrt{-g}\,J^\nu, \qquad \partial_{[\lambda}F_{\mu\nu]} = 0. \qquad (89)$$

They split (in Minkowski spacetime) into the familiar initial value constraints (spatial
projections, with no time derivatives):

$$\nabla\cdot\mathbf{E} = \frac{\rho}{\epsilon_0}, \qquad \nabla\cdot\mathbf{B} = 0, \qquad (90)$$

and the time projections, a pair of dynamic equations:

$$\dot{\mathbf{B}} + \nabla\times\mathbf{E} = 0, \qquad \nabla\times\mathbf{B} = \mu_0\mathbf{J} + \dot{\mathbf{E}}, \qquad (91)$$

which contain the first time derivatives of the dynamical fields linearly.

^q $Z_0$ is the vacuum impedance, $\epsilon_0 = (Z_0 c)^{-1}$ is the vacuum permittivity, and $\mu_0 = Z_0 c^{-1}$ is the
vacuum permeability. Here, we are taking for simplicity $\mu_0\epsilon_0 = c^{-2} = 1$ in relativistic spacetime
units.

The canonical Hamiltonian form of these equations is in terms of the 4-vector
potential (which satisfies $F = dA$ and splits into the scalar and vector potential).
The familiar vector form is

$$\text{constraint:}\quad \nabla\cdot\mathbf{E} = \frac{\rho}{\epsilon_0}, \qquad (92)$$
$$\text{dynamic:}\quad \dot{\mathbf{A}} = -\mathbf{E} - \nabla\Phi, \qquad -\dot{\mathbf{E}} = -\nabla\times(\nabla\times\mathbf{A}) + \mu_0\mathbf{J}. \qquad (93)$$

The scalar potential field appears here, but it has no evolution equation; it can be
chosen freely. This gauge freedom affects the evolution of an “unphysical” part of
the vector potential. Considering this along with the constraint on $\mathbf{E}$ (92), one finds
that the electromagnetic field has two physical degrees of freedom.
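
A brief check of how the potentials reorganize Maxwell's equations, using only standard vector identities and the units $\mu_0\epsilon_0 = 1$ adopted above. The counting in the final comment is the usual heuristic, stated here as a sketch:

```latex
\begin{align*}
\mathbf{E} &= -\dot{\mathbf{A}} - \nabla\Phi, \qquad \mathbf{B} = \nabla\times\mathbf{A}, \\
\nabla\cdot\mathbf{B} &= \nabla\cdot(\nabla\times\mathbf{A}) = 0, \\
\dot{\mathbf{B}} + \nabla\times\mathbf{E}
  &= \nabla\times\dot{\mathbf{A}} - \nabla\times\dot{\mathbf{A}} - \nabla\times\nabla\Phi = 0, \\
\nabla\times\mathbf{B} = \mu_0\mathbf{J} + \dot{\mathbf{E}}
  &\;\Longrightarrow\; -\dot{\mathbf{E}} = -\nabla\times(\nabla\times\mathbf{A}) + \mu_0\mathbf{J} .
\end{align*}
% Counting: 4 potential components (\Phi, \mathbf{A})
%   - 1 (\Phi has no evolution equation and is freely chosen)
%   - 1 (the Gauss constraint and its associated gauge direction)
%   = 2 physical degrees of freedom.
```

The homogeneous equations hold identically once the potentials are introduced; only the Gauss constraint and the Ampère-type evolution equation remain.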


13.1. Canonical Hamiltonian formalism 


The canonical Hamiltonian formalism® is of interest because it clearly reveals
the constraints, gauges, and degrees of freedom, as well as the total
energy-momentum, and it offers a practical way to numerically calculate solutions.
The dynamical theories of interest all have constraints. The canonical formalism
for constrained Hamiltonian systems was developed mainly by Dirac^{29-31} and
Bergmann.*° For a general discussion see, e.g. Hanson et al.,*” Sundermeyer.^{128,129}
Rosenfeld!°? seems to have been the first to consider a Hamiltonian approach to
GR, but this early work was not followed up. As far as we know Pirani et al.1°? were
the next to address the issue. Dirac gave a rather complete treatment in 1958.29°°
The treatment by Arnowitt, Deser and Misner (ADM)! has come to be regarded
as the standard. For a basic discussion see MTW (Ref. 82, Chap. 21) or Isenberg
and Nester.°® For some critical comparison, see Kiriushcheva and Kuzmin.’° Going
beyond Einstein’s theory, a remarkable “if constraint” formalism was developed for
the PG by Blagojević and Nikolić® to deal with a conditionally degenerate kinetic
Hessian. They use the Dirac type of approach; so to construct the Hamiltonian
one must first find the primary constraints, which depend on the conditional
degeneracies of the Legendre transformation. The “if constraint” technique is a
marvelous way to manage the technicalities involved in constructing the Hamiltonian.
In our first-order approach, in contrast, one can readily formally construct the
Hamiltonian and the Hamiltonian equations; however (in line with the principle of
the “conservation of difficulties”) a suitably adapted version of the “if constraint”
technique will still be needed when one actually tries to solve the dynamical and
constraint equations. The first-order approach as used in the covariant Hamiltonian
formalism allows one to investigate the general formalism and, in particular, to find
covariant expressions for the “conserved” quantities while postponing dealing with
such technical details.


13.2. The differential form of the spacetime decomposition 


Note: For the rest of this section, the spacetime geometry is quite general. 


A feature of this standard approach is the loss of manifest 4-covariance. Now a
Hamiltonian formulation essentially requires the time derivatives to be singled
out from the spatial derivatives, so in this sense it cannot be truly 4-covariant.
The usual approach, however, departs far more from 4-covariance than is necessary:
all the indices are 3+1 projected, leading to much extra bookkeeping. In
the ADM approach, the spacetime metric is replaced by the spatial metric and
the lapse and shift. However only the derivatives $\partial_\mu$ really need to be projected.
Since interaction fields are one-form fields, this means decomposing the exterior
differential $d$; decomposing $d$ will inevitably involve decomposing the differential
form. One of the reasons for using differential forms is how nicely they decompose
in this fashion.

Begin from the basic first-order form Lagrangian (86). Its variation (88) iden- 
tifies the first-order Euler-Lagrange expressions (88). According to Hamilton’s 
principle, the first-order Euler-Lagrange expressions should vanish. This gives us 
our first-order field equations. 

We want to extract the “time derivative” of $p$ and $\varphi$, i.e. the change with respect
to an evolution parameter (which we refer to as time) as seen by observers who
move along some fixed congruence of worldlines. This change is given by the Lie
derivative in the direction of the (fixed) vector field $Z$ tangent to the congruence:
i.e. $\partial_t := \pounds_Z$, denoted below by an overdot. The Lie derivative on the components
of differential forms is given by a simple neat expression: $\pounds_Z = d\,i_Z + i_Z d$. Using this,
from the differential we can extract the “time” derivative:

$$i_Z d\beta = \pounds_Z\beta - d\,i_Z\beta = \dot\beta - d\,i_Z\beta. \qquad (94)$$


(The congruence need not actually be timelike. Indeed, what we are doing here does
not require a metric tensor. Even when one has a metric, whether the vector field is
“timelike” is not an important issue: our whole Hamiltonian formalism is linear in
the spacetime displacement vector field $Z$, so by considering the difference between
two timelike displacements, one could get a spacelike displacement. A metrically
timelike displacement is important when one actually tries to find a physical solution
to the equations; for evolution one wants hyperbolic equations.)

The description of “time” also includes the idea of “instants of time.” Geometrically
this is a set (foliation) of (nonintersecting) 3D hypersurfaces (we usually
think of them as being spacelike). Locally, we can always choose adapted coordinates:
$x^\mu = (t, x^k)$, where $k = 1, 2, 3$, so that the spacelike hypersurfaces are $\Sigma_t$
with $t$ = constant. With respect to these adapted coordinates, $Z$ is the directional
derivative in the time direction: $Z = \partial_t$. Note that $i_Z dt = 1$.

From these considerations, we are led to define the “time” and “space” projections
of differential forms. We use the notations

$$\breve\alpha := i_Z\alpha, \qquad \underline\alpha := \alpha - dt\wedge\breve\alpha, \qquad (95)$$

to indicate the “time” component and the “spatial” part of a form. These projections
have simple expressions in terms of adapted coordinates.^r
Thus a general form decomposes according to

$$\alpha = dt\wedge\breve\alpha + \underline\alpha. \qquad (96)$$

^r We have long used this type of decomposition beginning with®?84; see Ref. 81 for a similar
technique.

In our formalism $Z$ and $t$ are thought of as freely (except that $i_Z dt = 1$) chosen
covariant fields; then the decomposition of $\alpha$ is essentially covariant. With this
notation, the differential decomposes according to $d\alpha = dt\wedge\breve{(d\alpha)} + \underline{d\alpha}$. From (94)

$$\breve{(d\alpha)} = \dot{\underline\alpha} - \underline{d}\,\breve\alpha; \qquad (97)$$

thus we can extract the part with the time derivative.

It is convenient to decompose the differential operator $d$ itself. In view of the
adapted coordinate expression $d = dx^\mu\wedge\partial_\mu = dt\wedge\partial_t + dx^k\wedge\partial_k$, we define the
decomposition as

$$d = dt\wedge\dot{d} + \underline{d}, \qquad \text{with } \dot{d} := \pounds_Z. \qquad (98)$$
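
To make the projections concrete, here is a sketch in adapted coordinates $(t, x^k)$ with $Z = \partial_t$, followed by a check of how the differential splits. The accents (a breve for the time component $i_Z\alpha$, an underline for the spatial part) and the sample 1-form and 2-form are choices made here for illustration.

```latex
\begin{align*}
\alpha &= \alpha_t\,dt + \alpha_k\,dx^k:
  & \breve\alpha &= \alpha_t, & \underline\alpha &= \alpha_k\,dx^k, \\
F &= F_{tk}\,dt\wedge dx^k + \tfrac12 F_{jk}\,dx^j\wedge dx^k:
  & \breve F &= F_{tk}\,dx^k, & \underline F &= \tfrac12 F_{jk}\,dx^j\wedge dx^k .
\end{align*}
% Splitting d\alpha, using d = dt \wedge \pounds_Z + \underline{d} and d(dt) = 0:
\begin{align*}
d\alpha &= d(dt\wedge\breve\alpha) + d\underline\alpha
         = -dt\wedge d\breve\alpha + d\underline\alpha \\
        &= -dt\wedge\underline{d}\,\breve\alpha
           + dt\wedge\pounds_Z\underline\alpha + \underline{d}\,\underline\alpha \\
        &= dt\wedge\left(\pounds_Z\underline\alpha - \underline{d}\,\breve\alpha\right)
           + \underline{d}\,\underline\alpha .
\end{align*}
```

The coefficient of $dt$ is the time component of $d\alpha$, and the remainder $\underline{d}\,\underline\alpha$ is its spatial part; only the former contains a time derivative.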


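Now we can examine the spacetime decomposition of our first-order field equations
(88). We first consider the time projections, which include all the time derivatives:

$$i_Z\frac{\delta\mathcal{L}^{1\mathrm{st}}}{\delta\varphi} = -\varsigma\left(\dot{\underline p} - \underline{d}\,\breve p\right) - i_Z\frac{\partial\Lambda}{\partial\varphi} = 0, \qquad (99)$$
$$i_Z\frac{\delta\mathcal{L}^{1\mathrm{st}}}{\delta p} = \dot{\underline\varphi} - \underline{d}\,\breve\varphi - i_Z\frac{\partial\Lambda}{\partial p} = 0; \qquad (100)$$

they are the dynamic equations for $\underline p$ and $\underline\varphi$. In order to use these equations to
evolve $\underline p$, $\underline\varphi$, we generally need to know $\breve p$ and $\breve\varphi$, which normally are provided by
the initial value constraints: The spatial restriction of (88):

$$\underline{\left(\frac{\delta\mathcal{L}^{1\mathrm{st}}}{\delta\varphi}\right)} = -\varsigma\,\underline{d}\,\underline p - \underline{\left(\frac{\partial\Lambda}{\partial\varphi}\right)} = 0, \qquad (101)$$
$$\underline{\left(\frac{\delta\mathcal{L}^{1\mathrm{st}}}{\delta p}\right)} = \underline{d}\,\underline\varphi - \underline{\left(\frac{\partial\Lambda}{\partial p}\right)} = 0. \qquad (102)$$

If these two equations can be solved for $\breve p$ and $\breve\varphi$ all is well and good (in that
case the two equations are, in Bergmann’s terminology, second class constraints).
They then define $\breve p$ and $\breve\varphi$ for all time as functions which depend on $\underline\varphi$, $\underline p$, $\underline{d}\,\underline p$ and
$\underline{d}\,\underline\varphi$. How to proceed for the case where these quantities cannot be found from the
constraints is best understood from concrete examples. For the purposes of this
work, the already-discussed Maxwell electrodynamic example is sufficient. In that
case, there is some undetermined gauge freedom.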


13.3. Spacetime decomposition of the variational formalism 


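We decomposed the equations. One could decompose the Lagrangian or its
variation. Our approach easily relates these alternatives, as can be seen from the
following:

$$\mathcal{L}^{1\mathrm{st}} = d\varphi\wedge p - \Lambda \;\Longrightarrow\; i_Z\mathcal{L}^{1\mathrm{st}} = \left(\dot{\underline\varphi} - \underline{d}\,\breve\varphi\right)\wedge\underline p - \left(\breve\Lambda + \varsigma\,\underline{d}\,\underline\varphi\wedge\breve p\right), \qquad (103)$$

$$\begin{aligned}
\delta\mathcal{L}^{1\mathrm{st}} &= d(\delta\varphi\wedge p) + \delta\varphi\wedge\frac{\delta\mathcal{L}^{1\mathrm{st}}}{\delta\varphi} + \frac{\delta\mathcal{L}^{1\mathrm{st}}}{\delta p}\wedge\delta p \\
\Longrightarrow\; \delta\,i_Z\mathcal{L}^{1\mathrm{st}} &= \partial_t\left(\delta\underline\varphi\wedge\underline p\right) - \underline{d}\left(\delta\breve\varphi\wedge\underline p + \varsigma\,\delta\underline\varphi\wedge\breve p\right) \\
&\quad + \delta\breve\varphi\wedge\underline{\left(\frac{\delta\mathcal{L}^{1\mathrm{st}}}{\delta\varphi}\right)} + \varsigma\,\delta\underline\varphi\wedge i_Z\frac{\delta\mathcal{L}^{1\mathrm{st}}}{\delta\varphi} + i_Z\frac{\delta\mathcal{L}^{1\mathrm{st}}}{\delta p}\wedge\delta\underline p - \varsigma\,\underline{\left(\frac{\delta\mathcal{L}^{1\mathrm{st}}}{\delta p}\right)}\wedge\delta\breve p.
\end{aligned} \qquad (104)$$

[Diagram, Eqs. (105) and (106): the Euler-Lagrange expressions $\delta\mathcal{L}^{1\mathrm{st}}/\delta\varphi$, $\delta\mathcal{L}^{1\mathrm{st}}/\delta p$ extracted from $\delta\mathcal{L}^{1\mathrm{st}}$ and then 3+1 projected, compared with those extracted from the 3+1 projected variation.]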


For our objectives here we will not need to use this projection into the space and 
time parts of form expressions very much. Our intention here was to include enough 
of the details so that the reader can have some confidence that this formalism can 
yield a proper Hamiltonian description. From what we have discussed, it can be 
seen that dynamical equations in this first-order covariant form already contain 
both the constraint and dynamical evolution equations. It should be noted that 
the first line of the above set of relations (103) shows how the Hamiltonian can be 
simply extracted from the first-order Lagrangian. 


14. The Hamiltonian and Its Boundary Term 


In this section, we establish some of our main formal results concerning the covari- 
ant Hamiltonian and its boundary term. The geometry is quite general. The energy, 
as well as the other conserved quantities, of a physical system can be identified with 
the value of the Hamiltonian. In particular for a gravitating system, the associated 
Hamiltonian is proportional to the field equations which vanish on-shell. Therefore 
the corresponding conserved quantities are determined by the Hamiltonian bound- 
ary term. The choice of Hamiltonian boundary term is associated with the specific 
boundary condition.19:21-?4


14.1. The translational Noether current


The action should not depend on the particular way points are labeled. Thus 
it should be invariant under diffeomorphisms, in particular, infinitesimal diffeo- 
morphisms — a displacement along some vector field Z. From a gauge theory 
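perspective, such displacements are a “local translation.” Under a local translation,
quantities change according to the Lie derivative. Hence, for a diffeomorphism
invariant action, the key variational relation (88) should be identically satisfied
when the variation operator $\delta$ is replaced by the Lie derivative $\pounds_Z$ ($= d\,i_Z + i_Z d$):

$$d\,i_Z\mathcal{L}^{1\mathrm{st}} = \pounds_Z\mathcal{L}^{1\mathrm{st}} = d(\pounds_Z\varphi\wedge p) + \pounds_Z\varphi\wedge\frac{\delta\mathcal{L}^{1\mathrm{st}}}{\delta\varphi} + \frac{\delta\mathcal{L}^{1\mathrm{st}}}{\delta p}\wedge\pounds_Z p. \qquad (107)$$

This simply means that $\mathcal{L}^{1\mathrm{st}}$ is a 4-form which depends on position only through
the fields $\varphi$, $p$. (According to our understanding this is only possible if the set of
fields in $\mathcal{L}^{1\mathrm{st}}$ includes some dynamic spacetime geometric variables: gravity.)
From (107) it directly follows that the 3-form

$$\mathcal{H}(Z) := \pounds_Z\varphi\wedge p - i_Z\mathcal{L}^{1\mathrm{st}} \qquad (108)$$

satisfies the identity

$$-d\mathcal{H}(Z) = \pounds_Z\varphi\wedge\frac{\delta\mathcal{L}^{1\mathrm{st}}}{\delta\varphi} + \frac{\delta\mathcal{L}^{1\mathrm{st}}}{\delta p}\wedge\pounds_Z p; \qquad (109)$$

thus it is a conserved “current” on shell (i.e. when the field equations are satisfied).
Substituting (86) into (108) gives the explicit expression

$$\mathcal{H}(Z) = d(i_Z\varphi\wedge p) + \varsigma\,i_Z\varphi\wedge dp + \varsigma\,d\varphi\wedge i_Z p + i_Z\Lambda; \qquad (110)$$

thus this conserved Noether translation current can be written as a 3-form linear
in the displacement vector plus a total differential:

$$\mathcal{H}(Z) =: Z^\mu\mathcal{H}_\mu + d\mathcal{B}(Z). \qquad (111)$$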
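
The step from the definition of $\mathcal{H}(Z)$ to its explicit form can be checked with the Cartan formula and the graded Leibniz rule. In this sketch $\varphi$ is an $f$-form, $\varsigma = (-1)^f$, and $\mathcal{L}^{1\mathrm{st}} = d\varphi\wedge p - \Lambda$; the last line is one natural way to read off the two pieces and is offered as an interpretation rather than a definition from the text.

```latex
\begin{align*}
\pounds_Z\varphi\wedge p &= d\,i_Z\varphi\wedge p + i_Z d\varphi\wedge p, \\
i_Z\mathcal{L}^{1\mathrm{st}} &= i_Z d\varphi\wedge p - \varsigma\,d\varphi\wedge i_Z p - i_Z\Lambda, \\
\mathcal{H}(Z) &= d\,i_Z\varphi\wedge p + \varsigma\,d\varphi\wedge i_Z p + i_Z\Lambda \\
  &= d(i_Z\varphi\wedge p) + \varsigma\,i_Z\varphi\wedge dp
     + \varsigma\,d\varphi\wedge i_Z p + i_Z\Lambda, \\
Z^\mu\mathcal{H}_\mu &= \varsigma\,i_Z\varphi\wedge dp + \varsigma\,d\varphi\wedge i_Z p + i_Z\Lambda,
  \qquad \mathcal{B}(Z) = i_Z\varphi\wedge p .
\end{align*}
```

The third-to-fourth line uses $d(i_Z\varphi\wedge p) = d\,i_Z\varphi\wedge p + (-1)^{f-1}\,i_Z\varphi\wedge dp$.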
Compare the differential of this expression, $d\mathcal{H}(Z) = dZ^\mu\wedge\mathcal{H}_\mu + Z^\mu d\mathcal{H}_\mu$, with (109);
equating the $dZ^\mu$ coefficient on both sides reveals that

$$Z^\mu\mathcal{H}_\mu = -i_Z\varphi\wedge\frac{\delta\mathcal{L}^{1\mathrm{st}}}{\delta\varphi} + \varsigma\,\frac{\delta\mathcal{L}^{1\mathrm{st}}}{\delta p}\wedge i_Z p. \qquad (112)$$

Thus, as a consequence of local diffeomorphism invariance, $\mathcal{H}_\mu$ vanishes on shell;
hence conservation of the translational Noether current (109) reduces to a differential
identity between Euler-Lagrange expressions. This is an instance of Noether’s
second theorem, and, moreover, it is exactly the sort of case to which Hilbert’s
remark regarding the lack of a proper energy law applies.
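
The coefficient comparison can be sketched as follows, writing $i_Z\varphi = Z^\mu\,i_\mu\varphi$ and $i_Z p = Z^\mu\,i_\mu p$ with $i_\mu := i_{e_\mu}$ for a frame $e_\mu$, and abbreviating the two Euler-Lagrange expressions as $E_\varphi$ and $E_p$ (shorthand introduced here). Only the $d\,i_Z$ part of $\pounds_Z = d\,i_Z + i_Z d$ produces terms containing $dZ^\mu$:

```latex
\begin{align*}
-d\mathcal{H}(Z)\big|_{dZ}
  &= dZ^\mu\wedge i_\mu\varphi\wedge E_\varphi + E_p\wedge dZ^\mu\wedge i_\mu p \\
  &= dZ^\mu\wedge\left(i_\mu\varphi\wedge E_\varphi - \varsigma\,E_p\wedge i_\mu p\right), \\
-d\mathcal{H}(Z)\big|_{dZ} &= -dZ^\mu\wedge\mathcal{H}_\mu
  \quad\Longrightarrow\quad
  \mathcal{H}_\mu = -i_\mu\varphi\wedge E_\varphi + \varsigma\,E_p\wedge i_\mu p .
\end{align*}
```

The sign $-\varsigma$ in the second line arises from moving the 1-form $dZ^\mu$ to the left of the $(f+1)$-form $E_p$. Contracting with $Z^\mu$ gives an expression that is manifestly proportional to the field equations.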

From the above it follows quite generally, as we remarked earlier in the special
case of GR (32), that the value of the conserved quantity, $-P(Z,V)$, associated with
a 3D region $V$ is determined by a 2-surface integral over the boundary, i.e. it is
quasi-local:

$$-P(Z,V) = \int_V \mathcal{H}(Z) = \oint_{\partial V}\mathcal{B}(Z). \qquad (113)$$

For any choice of $Z$ this expression defines a conserved quasi-local quantity. What
do these values mean? As we shall see in detail later, for a suitable timelike (spacelike)
quasi-translation displacement on the boundary the expression defines a quasi-local
energy (momentum), and for a suitable quasi-rotation (boost) it defines a
quasi-local angular momentum (CoMM). However it must be noted that, like all
other conserved currents, the translational current is subject to the usual
ambiguity: one can add by hand the differential of any 2-form and still have a conserved
current. But that amounts to being able to adjust $\mathcal{B}$ freely; consequently
one could obtain almost any quasi-local value. The Hamiltonian perspective brings
this freedom under physical control. As we shall show, the first-order translational
current 3-form is something more: it is the generator of local diffeomorphisms, i.e.
the Hamiltonian.


14.2. The Hamiltonian formulation 


From the first-order field equations (88), by contraction with a “time evolution 
vector field” Z, we get a pair of Hamiltonian-like evolution equations for the “time 
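derivatives” $\pounds_Z\varphi$, $\pounds_Z p$. A key identity involving these time derivatives is revealed
by comparing two relations. Consider the projection of the Lagrangian 4-form
$i_Z\mathcal{L}^{\rm 1st}$, which from (108) is just $\pounds_Z\varphi \wedge p - \mathcal{H}(Z)$; its variation is

$$\begin{aligned}
\delta\, i_Z\mathcal{L}^{\rm 1st} &= \delta(\pounds_Z\varphi \wedge p) - \delta\mathcal{H}(Z) \\
&= \delta(\pounds_Z\varphi) \wedge p + \pounds_Z\varphi \wedge \delta p - \delta\mathcal{H}(Z) \\
&= \pounds_Z\delta\varphi \wedge p + \pounds_Z\varphi \wedge \delta p - \delta\mathcal{H}(Z) \\
&= \pounds_Z(\delta\varphi \wedge p) - \delta\varphi \wedge \pounds_Z p + \pounds_Z\varphi \wedge \delta p - \delta\mathcal{H}(Z).
\end{aligned} \tag{114}$$

Compare this with the projection of $\delta\mathcal{L}^{\rm 1st}$ (88) along $Z$:

$$\begin{aligned}
i_Z\delta\mathcal{L}^{\rm 1st} &= i_Z d(\delta\varphi \wedge p) + i_Z\left(\delta\varphi \wedge \frac{\delta\mathcal{L}^{\rm 1st}}{\delta\varphi} + \frac{\delta\mathcal{L}^{\rm 1st}}{\delta p} \wedge \delta p\right) \\
&= \pounds_Z(\delta\varphi \wedge p) - d\, i_Z(\delta\varphi \wedge p) + i_Z\left(\delta\varphi \wedge \frac{\delta\mathcal{L}^{\rm 1st}}{\delta\varphi} + \frac{\delta\mathcal{L}^{\rm 1st}}{\delta p} \wedge \delta p\right).
\end{aligned} \tag{115}$$

Since $Z$ is not varied, the two relations are identical: $\delta\, i_Z\mathcal{L}^{\rm 1st} = i_Z\delta\mathcal{L}^{\rm 1st}$; consequently,

$$\delta\mathcal{H}(Z) = -\delta\varphi \wedge \pounds_Z p + \pounds_Z\varphi \wedge \delta p + d\, i_Z(\delta\varphi \wedge p) - i_Z\left(\delta\varphi \wedge \frac{\delta\mathcal{L}^{\rm 1st}}{\delta\varphi} + \frac{\delta\mathcal{L}^{\rm 1st}}{\delta p} \wedge \delta p\right). \tag{116}$$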
The last term vanishes “on shell.” This relation identifies the Noether translational
current $\mathcal{H}(Z)$ as the Hamiltonian 3-form (i.e. density), as the following considerations
show. The integral of $\mathcal{H}(Z)$ over a 3D region,

$$H(Z,\Sigma) = \int_\Sigma \mathcal{H}(Z), \tag{117}$$

is the Hamiltonian which displaces this region along $Z$, since the integral of its
variation,

$$\delta H(Z,\Sigma) = \int_\Sigma \delta\mathcal{H}(Z), \tag{118}$$

yields, from (116), “on shell” (then the last bracketed term vanishes) the Hamilton
equations

$$\pounds_Z\varphi = \frac{\delta H(Z,\Sigma)}{\delta p}, \qquad \pounds_Z p = -\frac{\delta H(Z,\Sigma)}{\delta\varphi}, \tag{119}$$

if the boundary term in the variation of the Hamiltonian vanishes. In this case
that means when $\delta\varphi$ vanishes on $\partial\Sigma$. Technically, the variational derivatives of the
Hamiltonian $H(Z,\Sigma)$ displayed in (119) are only defined for variations satisfying
this boundary condition. In other words, this Hamiltonian is “well-defined,” i.e.
functionally differentiable, only on the phase space of fields satisfying the particular
boundary condition $\delta\varphi|_{\partial\Sigma} = 0$.
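
As an illustrative aside, the boundary condition can be read off by expanding the total differential in (116) with the graded Leibniz rule for the interior product. The sketch below assumes that $\varphi$ is an $f$-form and uses the sign factor $\varsigma := (-1)^f$; both are conventions assumed here for the purpose of the example.

```latex
% Expanding the boundary term of (116); \varsigma = (-1)^f for an f-form \varphi (assumed convention).
\begin{align*}
d\, i_Z(\delta\varphi \wedge p)
  &= d\left( i_Z\delta\varphi \wedge p + \varsigma\, \delta\varphi \wedge i_Z p \right), \\
\int_\Sigma d\, i_Z(\delta\varphi \wedge p)
  &= \oint_{\partial\Sigma} \left( i_Z\delta\varphi \wedge p + \varsigma\, \delta\varphi \wedge i_Z p \right)
   = 0 \quad \text{whenever } \delta\varphi|_{\partial\Sigma} = 0 .
\end{align*}
```

Both terms in the surface integrand are linear in $\delta\varphi$, which is why fixing the field on the boundary is enough to remove them.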


14.3. Boundary terms: The boundary condition and reference 


In some important cases, the fields of physical interest do not satisfy the boundary 
condition naturally inherited from the Lagrangian,”! this happens in particular for 
the spacetime metric of an asymptotically flat region. A modified formulation is 
needed to deal with this. 

One alternative is to modify the Lagrangian 4-form itself by a total differential. 
This strategy has often been adopted, beginning with Einstein (9) and including 
many of the Hamiltonian formulations.13°10? But such a modification is necessar- 
ily noncovariant. For our formalism, we want to keep our Lagrangian covariant. 
Furthermore a Lagrangian boundary term would modify the boundary condition 
on the whole 3D boundary of the spacetime region, thus inducing the same type 
of modification on the spatial boundary at large spatial distances as on the initial 
time hypersurface. However we want the freedom to adjust the boundary condition
on the 2D boundary of the spacelike region, $\partial\Sigma$, independently of the type of initial
conditions imposed within the initial time hypersurface $\Sigma_t$. Thus for our objectives
we turn to the Hamiltonian boundary term.^s

Note that the Hamiltonian (111) has two distinct parts; each plays a distinct 
role. The proper density $Z^\mu\mathcal{H}_\mu$, although it has vanishing value on shell, generates


^s In the end, it turns out that our favored Hamiltonian boundary term for GR is related to one
induced by a Lagrangian boundary term.?!
the equations of motion, whereas the boundary term $\mathcal{B}(Z)$ determines not only the
quasi-local value (113) but also the boundary condition. Now we should make note
of a very important fact: the boundary term can be adjusted without changing
the Hamilton equations or the conservation property (109). Thus one can replace
the 2-form $\mathcal{B}(Z) = i_Z\varphi \wedge p$ inherited from the Lagrangian by another.

Such an adjustment is in one respect just a special case of the conserved Noether 
current ambiguity (i.e. for any 2-form $\chi$, $J$ and $J' := J + d\chi$ are both conserved
currents (3-forms) if $dJ = 0$, even though they define different conserved values).

However here, in this Hamiltonian case, any such adjustment modifies — in par- 
allel — not only the value of the quasi-local quantities but also the spatial boundary 
conditions. Thus the boundary term ambiguity is under physical control: Each dis- 
tinct choice of the quasi-local expression given by the Hamiltonian boundary term 
is associated with a physically distinct boundary condition. 

In order to accommodate suitable boundary conditions we found that, in general, 
one needs to introduce on the boundary for each of the dynamical fields certain 
reference values $\bar p$, $\bar\varphi$, which represent the ground state of the field, the “vacuum”
(or background field) values. This is necessary in particular for fields whose natural 
ground state is nonvanishing; the spacetime metric is such a field. 

We take our boundary terms to be linear in $\Delta\varphi := \varphi - \bar\varphi$, $\Delta p := p - \bar p$, so
that they (and thus all the quasi-local quantities) vanish if the fields take on the
ground state (reference) values.^t We presume that the reference values (like $Z$) are
not varied: $\delta\bar\varphi = 0$ and $\delta\bar p = 0$; consequently $\delta\Delta\varphi = \delta\varphi$, $\delta\Delta p = \delta p$.


14.4. Covariant-symplectic Hamiltonian boundary terms 


To find an improved Hamiltonian boundary term for (110), first drop the one inherited
from the Lagrangian and examine the boundary term generated in the variation of
the 3-form part of the Hamiltonian (110); it is $-i_Z\varphi \wedge \delta p + \varsigma\, \delta\varphi \wedge i_Z p$. This invites
us to add a suitable complementary boundary term. In this way, we were led to the
boundary terms^{21,23,24}

$$\mathcal{B}(Z) := i_Z \left\{ \begin{matrix} \varphi \\ \bar\varphi \end{matrix} \right\} \wedge \Delta p - \varsigma\, \Delta\varphi \wedge i_Z \left\{ \begin{matrix} p \\ \bar p \end{matrix} \right\}. \tag{120}$$

Then the associated variational Hamiltonian boundary term becomes

$$\delta\mathcal{H}(Z) \sim d\left[ \left\{ \begin{matrix} i_Z\delta\varphi \wedge \Delta p \\ -i_Z\Delta\varphi \wedge \delta p \end{matrix} \right\} + \varsigma \left\{ \begin{matrix} -\Delta\varphi \wedge i_Z\delta p \\ \delta\varphi \wedge i_Z\Delta p \end{matrix} \right\} \right]. \tag{121}$$

Here, for each bracket independently one may choose either the upper or lower
term, which represent essentially a choice of Dirichlet (fixed field) or Neumann
(fixed momentum) boundary conditions for the space and time parts of the fields
separately.^u


^t Some authors use the terminology regularize.
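
The step from (120) to (121) can be checked directly. The following is a sketch of that check, assuming the sign factor $\varsigma := (-1)^f$ for an $f$-form $\varphi$ and using the fact that the reference values are not varied, so that $\delta\Delta\varphi = \delta\varphi$ and $\delta\Delta p = \delta p$. The labels "upper" and "lower" are introduced here only to name the two bracket choices.

```latex
% Boundary term from the 3-form part of the Hamiltonian, plus the variation of B(Z).
% "upper"/"lower" are labels introduced for this check; \varsigma = (-1)^f (assumed convention).
\begin{align*}
\mathcal{B}_{\rm upper}(Z) &= i_Z\varphi \wedge \Delta p - \varsigma\, \Delta\varphi \wedge i_Z p, \\
\left(-i_Z\varphi \wedge \delta p + \varsigma\, \delta\varphi \wedge i_Z p\right) + \delta\mathcal{B}_{\rm upper}(Z)
  &= i_Z\delta\varphi \wedge \Delta p - \varsigma\, \Delta\varphi \wedge i_Z\delta p, \\[4pt]
\mathcal{B}_{\rm lower}(Z) &= i_Z\bar\varphi \wedge \Delta p - \varsigma\, \Delta\varphi \wedge i_Z \bar p, \\
\left(-i_Z\varphi \wedge \delta p + \varsigma\, \delta\varphi \wedge i_Z p\right) + \delta\mathcal{B}_{\rm lower}(Z)
  &= -i_Z\Delta\varphi \wedge \delta p + \varsigma\, \delta\varphi \wedge i_Z\Delta p .
\end{align*}
```

In the upper case only $\delta\varphi$ appears differentiated by $\delta$ in the first bracket (field held fixed), while in the lower case only $\delta p$ does (momentum held fixed), which matches the Dirichlet and Neumann reading of the two choices.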

In each of these cases, the boundary term in the Hamiltonian variation has a 
certain symplectic structure which pairs certain control quantities — i.e. the inde- 
pendent variables — with certain associated response quantities — the dependent 
variables. (For discussions of this paradigm of the symplectic structure associated
with variational principles, which we have found to be illuminating, see Refs. 68 and
69.) The symmetry of the above expressions under an interchange of “control” and
“response,” formally $\delta \to \Delta$, $\Delta \to -\delta$, is noteworthy.

Thus, although it is not so well known, when the issue is examined one can 
readily see that there are many choices of boundary conditions and consequently 
really many different expressions for energy in classical field theory. This is true 
especially for gravitating fields. Actually this sort of thing is not unusual in 
physics; in particular one can compare the situation with that in thermodynamics 
(which has several physically meaningful energies: internal, enthalpy, Gibbs and 
Helmholtz). 

Nevertheless, it should be noted that one of our boundary term expressions 
stands out: For any field which allows trivial reference values, $\bar\varphi = 0 = \bar p$, one
boundary term choice vanishes (the lower choice in each bracket). Such fields, with 
this choice of boundary condition, make no explicit contribution to the quasi-local 
boundary term. This particular boundary term has another virtue: For any field 
with gauge freedom, it is the only gauge invariant choice. Thus there is a cer- 
tain preferred boundary expression — and thus a preferred boundary condition — 
for this large class of fields, a class which includes all the physical fields of the 
standard model. There is, however, a quite important exception: Gravity, more 
specifically, any gravity theory formulated in terms of dynamic spacetime geometry 
which includes the spacetime metric as a dynamical field. The natural reference 
choice for the metric is not a vanishing metric tensor but rather the nonvanishing 
Minkowski metric. Consequently one must have, in general, a nonvanishing Hamil- 
tonian boundary term. 


15. Standard Asymptotics 


This section is concerned with suitable asymptotic conditions for our classical fields 
at spatial and null infinity. It also includes a discussion of energy flux. 

For spatial infinity, the issue of asymptotic conditions was first investigated in 
GR by Regge and Teitelboim,!° with later refinements by Beig and O Murchadha? 
and then Szabados.!%?:!34 We have developed a similar idea for general fields. 


^u There are more complicated possibilities, “mixed” choices involving some linear combination of
the upper and lower expressions.!!9 We do not have any specific physical examples, but mixed 
boundary conditions may be of interest in certain cases. 
15.1. Spatial infinity


For finite regions, these boundary terms in the variation of the Hamiltonian tell 
us exactly what needs to be held fixed (i.e. “controlled”). For asymptotically flat
regions, however, one should take into account the asymptotic fall off rates. The 
various boundary terms we have constructed enable the Hamiltonian to be well 
defined on the phase space of fields with suitable asymptotic behavior for all typical 
physical fields. 

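For the fields, it is sufficient^v to take the respective asymptotic fall offs for even
and odd parity terms to be

$$\Delta\varphi \approx O^+\!\left(\frac{1}{r}\right) + O^-\!\left(\frac{1}{r^2}\right), \qquad \Delta p \approx O^-\!\left(\frac{1}{r^2}\right) + O^+\!\left(\frac{1}{r^3}\right). \tag{122}$$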


Parity here means the parity of the components in an asymptotically Cartesian 
reference frame. The 2-surface area element has odd parity, so even parity 2-forms 
automatically have vanishing 2-surface integral. 
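
The parity statement can be illustrated numerically. The following is a minimal sketch, not part of the original treatment: it integrates $f\, n_k\, dS$ over the unit sphere with a midpoint rule, for one even and one odd test function. The function name `sphere_flux`, the grid sizes and the two test functions are all choices made here for illustration.

```python
import math

def sphere_flux(f, n_theta=200, n_phi=400):
    """Midpoint-rule estimate of the three components of the surface integral
    of f(n) n_k dS over the unit sphere (n is the outward unit normal)."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = [0.0, 0.0, 0.0]
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            n = (math.sin(theta) * math.cos(phi),
                 math.sin(theta) * math.sin(phi),
                 math.cos(theta))
            weight = math.sin(theta) * d_theta * d_phi  # scalar area element
            value = f(*n)
            for k in range(3):
                # n_k dS changes sign under n -> -n, i.e. it has odd parity
                total[k] += value * n[k] * weight
    return total

def even_f(x, y, z):
    # Hypothetical even-parity test function: f(-n) = f(n)
    return x * x + y * z

def odd_f(x, y, z):
    # Hypothetical odd-parity test function: f(-n) = -f(n)
    return z

even_flux = sphere_flux(even_f)  # every component cancels between antipodal points
odd_flux = sphere_flux(odd_f)    # the z component approximates 4*pi/3
```

The even-parity integrand cancels pairwise between antipodal grid points, while the odd-parity one survives; this is the mechanism that removes the otherwise divergent even-parity contributions at spatial infinity.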

For asymptotically flat spaces, the displacement should asymptotically be a 
Minkowski Killing vector, i.e. an infinitesimal Poincaré displacement. It is sufficient 
to take 


$$Z^\mu \approx Z_0^\mu + \lambda_0^{\mu}{}_{\nu}\, x^\nu + O^+\!\left(\frac{1}{r}\right) + O^-(1), \tag{123}$$


where (in terms of asymptotically Minkowski coordinates) $Z_0^\mu$ is a constant translation
parameter and $\lambda_0^{\mu\nu} = \lambda_0^{[\mu\nu]}$ is a constant asymptotic infinitesimal Lorentz
boost/rotation parameter.

With the asymptotics (122) and (123), it is straightforward to check that for any 
of the boundary term choices (120) all of the quasi-local quantities have finite values; 
furthermore, for any of the choices all of our Hamiltonians are differentiable on the 
specified phase space — since the respective boundary terms in the variations of the 
Hamiltonians (121) vanish asymptotically. Thus our Hamiltonians are generally well 
defined on a large phase space which includes physically interesting solutions. At 
asymptotically-flat spatial infinity, the aforementioned asymptotics are physically 
reasonable. Our considerations extend straightforwardly to asymptotically
anti-de Sitter spaces; the details are omitted here.


15.2. Null infinity 


Let us next consider what can be expected if the boundary 2-surface $\partial\Sigma$ of our region
approaches future null infinity. Long range radiation fields (e.g. electromagnetism) 


^v Sufficient, but not necessary. When examined in detail it can be seen that one really only needs
conditions on certain combinations of the components, but it is not in the spirit of our treatment 
to break fields up into, e.g. components parallel to and perpendicular to some specific boundary 
surface, etc. Here we are satisfied with a formalism that includes a large class of fields. We leave to 
more specific investigations finding the largest acceptable phase space with the weakest conditions. 
have slower fall offs, like $\Delta p \sim \delta\varphi \approx O(1/r)$. Then the boundary terms in the
variation of our various Hamiltonians will not vanish, so the Hamiltonian is no
longer functionally differentiable. This seeming calamity is actually providential:
it is directing us to additional physics contained within the formalism, namely
energy flux expressions.24+84151


15.3. Energy flux 


For the flux of “energy” there is a special way of calculating — the analog of the 
classical mechanics calculation (for conservative Hamiltonian systems) of 


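$$\delta H = \dot q^k\, \delta p_k - \dot p_k\, \delta q^k \quad \Longrightarrow \quad \dot E := \dot H = 0, \tag{124}$$


under the replacement $\delta \to d/dt$, where the remarkable cancelation is a consequence
of the particular (symplectic) form of the Hamiltonian variation. In the present case
from (116) under the replacement $\delta \to \pounds_Z$ the same type of symplectic calculation
occurs, and we are left with the respective contributions from our various boundary
term choices (121):

$$\pounds_Z\mathcal{H}(Z) = d\left[ \left\{ \begin{matrix} i_Z\pounds_Z\varphi \wedge \Delta p \\ -i_Z\Delta\varphi \wedge \pounds_Z p \end{matrix} \right\} + \varsigma \left\{ \begin{matrix} -\Delta\varphi \wedge i_Z\pounds_Z p \\ \pounds_Z\varphi \wedge i_Z\Delta p \end{matrix} \right\} \right]. \tag{125}$$

(We are presuming that $\pounds_Z\bar\varphi$, $\pounds_Z\bar p$ vanish.)

In particular, as we mentioned earlier, for all fields with vanishing reference,
there is a standard choice of Hamiltonian boundary term, namely the one that
vanishes. The corresponding energy flux expression is

$$\pounds_Z\mathcal{H}(Z) = d\left(-i_Z\varphi \wedge \pounds_Z p + \varsigma\, \pounds_Z\varphi \wedge i_Z p\right). \tag{126}$$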
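
The classical-mechanics cancelation in (124) can be seen in a few lines of code. This is a minimal sketch under stated assumptions: the anharmonic-oscillator Hamiltonian, the finite-difference step and the leapfrog integrator are all illustrative choices made here and do not come from the text.

```python
def hamiltonian(q, p):
    # Illustrative conservative system (anharmonic oscillator); an assumption of this sketch.
    return 0.5 * p * p + 0.5 * q * q + 0.25 * q ** 4

def grad(q, p, h=1e-6):
    """Central-difference estimates of (dH/dq, dH/dp)."""
    dH_dq = (hamiltonian(q + h, p) - hamiltonian(q - h, p)) / (2 * h)
    dH_dp = (hamiltonian(q, p + h) - hamiltonian(q, p - h)) / (2 * h)
    return dH_dq, dH_dp

def energy_rate(q, p):
    """dH/dt from  delta H = qdot * delta p - pdot * delta q  with delta -> d/dt."""
    dH_dq, dH_dp = grad(q, p)
    q_dot, p_dot = dH_dp, -dH_dq          # Hamilton equations
    return q_dot * p_dot - p_dot * q_dot  # the two symplectic terms cancel identically

def leapfrog(q, p, dt, steps):
    """Kick-drift-kick integration of the separable Hamiltonian above."""
    for _ in range(steps):
        p -= 0.5 * dt * grad(q, p)[0]
        q += dt * grad(q, p)[1]
        p -= 0.5 * dt * grad(q, p)[0]
    return q, p

q0, p0 = 1.0, 0.3
q1, p1 = leapfrog(q0, p0, dt=1e-3, steps=5000)
drift = abs(hamiltonian(q1, p1) - hamiltonian(q0, p0))  # stays small along the flow
```

In the field theory case the same cancelation removes the volume terms of (116), so that only the boundary contributions of (125) remain.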


16. Application to Electromagnetism 


To illustrate these ideas in a familiar setting, we briefly consider vacuum electro- 
magnetism in Minkowski space (for the complete details see Ref. 24). 

For electromagnetism in Minkowski space, the formalism developed above, with 
the important exception of the “on shell” vanishing of $\mathcal{H}_\mu$, is still applicable. A first-
order Lagrangian 4-form for the (source free) Maxwell one-form (the U(1) gauge 
potential) A is 


$$\mathcal{L}^{\rm 1st}_{\rm EM} = dA \wedge P - \tfrac{1}{2} Z_0\, {*P} \wedge P, \tag{127}$$

which yields the pair of first-order equations

$$dP = 0, \qquad dA - Z_0\, {*P} = 0. \tag{128}$$

These are just the vacuum Maxwell equations with ${*P} = Z_0^{-1} F := Z_0^{-1} dA$; hence
$P = -Z_0^{-1}\, {*F}$, and $d{*F} = 0$. (Here $Z_0$ is the vacuum impedance, which has the
value $\mu_0 = \varepsilon_0^{-1}$ in our relativistic units in which $c = 1$. With our conventions, our
conjugate momentum field $P$ turns out to be the negative of $H$ which was introduced

J-226 C.-M. Chen, J. M. Nester and R.-S. Tung 


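earlier in (39)). With $Z = \partial_t$ and the decomposition $A = (-\phi, A_k)$, we find that
$i_Z F = i_Z dA = \pounds_Z A - d\, i_Z A$ corresponds to $F_{0k} = \dot A_k + \partial_k\phi = -E_k$. The magnetic
field strength is $F_{ij} := \partial_i A_j - \partial_j A_i =: \epsilon_{ijk} B^k$. Hence, $P_{0i} = -Z_0^{-1}\, {*F}_{0i} = -\mu_0^{-1} B_i$,
and $P_{ij} = -Z_0^{-1}\, {*F}_{ij} = -\varepsilon_0 \epsilon_{ijk} E^k$. The natural reference choice is $\bar A = 0 = \bar P$.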

The Hamiltonian 3-form is 


$$\mathcal{H}^{\rm EM}(Z) = -i_Z A\, dP - dA \wedge i_Z P + i_Z\!\left(\tfrac{1}{2} Z_0\, {*P} \wedge P\right) + d\mathcal{B}^{\rm EM}(Z). \tag{129}$$


In the usual tensor index notation, the volume density part has the form

$$\mathcal{H}^{\rm EM} = \phi\, \partial_k \pi^k + \tfrac{1}{2}(\partial_i A_j - \partial_j A_i)\, \epsilon^{ijk} H_k + \frac{1}{2\varepsilon_0}\, \pi^k \pi_k - \frac{\mu_0}{2}\, H^k H_k, \tag{130}$$

where the momentum conjugate to the 3-vector potential is given the name $\pi^k$ (it
works out to have in the usual terminology the value $-\varepsilon_0 E^k$, i.e. $-D^k$). By varying
$H^k$ one obtains $\mu_0 H^k = \tfrac{1}{2}\epsilon^{kij}(\partial_i A_j - \partial_j A_i) = B^k$, a 2nd class constraint that could
be used to eliminate the magnetic field; then the Hamiltonian volume density would
correspond to the familiar energy density $\tfrac{1}{2}(\varepsilon_0 E^2 + \mu_0^{-1} B^2)$ plus a gauge generating
term, $\phi\, \partial_k \pi^k$, which vanishes “on shell”; the scalar potential in this term acts as a
Lagrange multiplier to enforce the (first class) Gauss constraint $\partial_k D^k = 0$.
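
The elimination of the magnetic field mentioned above can be spelled out. The following is a sketch, assuming the magnetic part of the volume density consists of a term linear in $H_k$ with coefficient $B^k = \tfrac{1}{2}\epsilon^{kij}(\partial_i A_j - \partial_j A_i)$ and a quadratic term $-\tfrac{1}{2}\mu_0 H^k H_k$, and that $\pi^k = -\varepsilon_0 E^k$ as stated in the text.

```latex
% Substituting the constraint \mu_0 H^k = B^k into the magnetic part of the density,
% and \pi^k = -\varepsilon_0 E^k into the electric part.
\begin{align*}
B^k H_k - \frac{\mu_0}{2}\, H^k H_k
  &= \frac{1}{\mu_0}\, B^2 - \frac{1}{2\mu_0}\, B^2
   = \frac{1}{2\mu_0}\, B^2, \\
\frac{1}{2\varepsilon_0}\, \pi^k \pi_k
  &= \frac{1}{2\varepsilon_0}\, (\varepsilon_0 E)^2
   = \frac{\varepsilon_0}{2}\, E^2 .
\end{align*}
```

Adding the two lines gives $\tfrac{1}{2}(\varepsilon_0 E^2 + \mu_0^{-1} B^2)$, the familiar energy density quoted in the text.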

Let us just consider two boundary term choices, namely our preferred choice with 
vanishing boundary term, and the above Hamiltonian 3-form with the boundary 
term 


$$\mathcal{B}^{\rm EM} = i_Z A\, P = -\phi\, \pi^k dS_k. \tag{131}$$


These two are actually both well known physically: the former corresponds to the
energy density from the gauge invariant energy-momentum tensor (8), and the 
latter is the energy density of the electromagnetic canonical energy-momentum 
tensor (7). Here, our interest is not in the field equations but in the total differential 
term which, upon integration, becomes a boundary term indicating the boundary 
condition. Briefly, for the choice with vanishing Hamiltonian boundary term, the 
total derivative term in the variation of the Hamiltonian is 


$$-d(i_Z A\, \delta P + \delta A \wedge i_Z P) \sim \partial_k(\phi\, \delta\pi^k - \epsilon^{kij}\, \delta A_i H_j), \tag{132}$$


which tells us that one should hold fixed on the boundary of the dynamical region 
the normal component of the electric field and the surface parallel components of 
the vector potential (the gauge independent part of which determines the normal 
component of the magnetic field). On the other hand, for the Hamiltonian including 
the boundary term (131), one finds that the total differential in the variation of the 
Hamiltonian is now 


$$d(i_Z\delta A\, P - \delta A \wedge i_Z P) \sim -\partial_k(\delta\phi\, \pi^k + \epsilon^{kij}\, \delta A_i H_j). \tag{133}$$


This is the same boundary condition in the vector potential/magnetic sector but 
now for the electric sector, one should instead hold fixed on the boundary of the 
region the scalar potential. 
The physical meaning of such boundary conditions is well known. Fixing the
normal component of the electric field on the boundary corresponds to fixing the 
surface charge density. An instructive physical example is a parallel plate capac- 
itor. One can use a battery to charge up a capacitor with a moveable dielectric. 
Disconnect the battery and measure the work needed to remove/insert the dielectric 
(the potential varies but the charge is fixed, no current or power flow). Alterna- 
tively leave the battery connected and measure the work needed to displace the 
dielectric — now the potential is fixed but the charge varies, so current and hence 
power flows. The respective boundary terms in the variation of the Hamiltonian are
$\phi\, \delta\pi^k dS_k$ and $-\delta\phi\, \pi^k dS_k$. Both boundary condition choices are physically meaningful,
corresponding to real situations.
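
The two capacitor experiments can be compared in a lumped-circuit sketch. This is an illustration only, assuming an ideal capacitor whose capacitance changes quasi-statically when the dielectric is moved. The function names and the numerical values are hypothetical.

```python
def work_fixed_charge(charge, c_initial, c_final):
    """External work with the battery disconnected (Q fixed): change of Q^2 / (2C)."""
    return charge ** 2 / (2.0 * c_final) - charge ** 2 / (2.0 * c_initial)

def work_fixed_potential(voltage, c_initial, c_final):
    """External work with the battery connected (V fixed):
    change of field energy minus the energy delivered by the battery."""
    delta_field_energy = 0.5 * (c_final - c_initial) * voltage ** 2
    battery_energy = voltage * (c_final - c_initial) * voltage  # V * (change in Q)
    return delta_field_energy - battery_energy

# Hypothetical numbers: withdrawing the dielectric halves the capacitance.
c0, c1, v0 = 2.0e-9, 1.0e-9, 12.0
q0 = c0 * v0

w_q = work_fixed_charge(q0, c0, c1)     # potential rises, no current flows
w_v = work_fixed_potential(v0, c0, c1)  # charge flows back into the battery

# For a very small displacement the two bookkeepings give the same work (same force).
c_small = c0 * (1.0 - 1.0e-6)
w_q_small = work_fixed_charge(q0, c0, c_small)
w_v_small = work_fixed_potential(v0, c0, c_small)
```

For a finite displacement the two values differ, since the battery exchanges energy with the system across its boundary. For an infinitesimal displacement they agree, which reflects that the two energies differ only by a boundary (Legendre-type) term.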

Nevertheless for electrodynamics one expression stands out, the one with van- 
ishing boundary term. This choice is the only one in which the value of the Hamilto- 
nian is gauge invariant. Moreover, this is the only non-negative Hamiltonian density. 
Consequently the associated energy has a lower bound and the system has a natural 
vacuum or ground state: zero energy for vanishing fields. The value of the Hamil- 
tonian with this boundary term can be interpreted as the internal energy, whereas 
the other expressions can be regarded as including some additional energy on the 
boundary of the system associated with maintaining the boundary condition. The 
associated electromagnetic energy flux expression from our formula (126) reduces 
to just the usual Poynting energy flux: 


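$$\begin{aligned}
\pounds_Z\mathcal{H}^{\rm EM} &= d\left[-i_Z A \wedge (d\, i_Z + i_Z d)P - (d\, i_Z + i_Z d)A \wedge i_Z P\right] \\
&= -d(i_Z F \wedge i_Z P) \\
&= d(-E_i H_j\, dx^i \wedge dx^j).
\end{aligned} \tag{134}$$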


Clearly this choice, associated with fixing the normal components of the electric 
and magnetic fields on the boundary, is preferred; it is the one suitable for most 
physical applications. It gives the usual energy density and Poynting energy flux. 
Similarly, for all other fields — except for dynamic spacetime geometry — there is 
available a standard Hamiltonian (the one with vanishing boundary term contribu- 
tion) associated with a certain preferred boundary condition. 
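
The reduction to the Poynting flux in (134) uses one intermediate step that is worth writing out. The sketch below assumes the on-shell condition $dP = 0$ from (128) and $F := dA$, and uses the fact that $i_Z A$ is a 0-form.

```latex
% Intermediate step in (134), assuming dP = 0 on shell and F := dA.
\begin{align*}
-i_Z A \wedge (d\, i_Z + i_Z d)P - (d\, i_Z + i_Z d)A \wedge i_Z P
  &= -i_Z A\; d(i_Z P) - d(i_Z A) \wedge i_Z P - i_Z F \wedge i_Z P \\
  &= -d\left(i_Z A\; i_Z P\right) - i_Z F \wedge i_Z P, \\
d\left[-d\left(i_Z A\; i_Z P\right) - i_Z F \wedge i_Z P\right]
  &= -d\left(i_Z F \wedge i_Z P\right).
\end{align*}
```

The gauge-dependent piece drops out because it is exact, leaving a flux built only from the field strengths.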


17. Geometry: Covariant Differential Formulation 


In the discussion of our covariant Hamiltonian approach, up to this point (except 
as was specified for a couple of specific examples), there has been no need to make 
any restriction on the type of geometry for our manifold. Here in this section we 
discuss the specific sort of dynamic spacetime geometry that we will consider and 
relate it to the gauge theory paradigm. 

The covariant Hamiltonian formulation can apply to general theories of dynam- 
ical geometry. Standard references for differential geometry are Kobayashi and 
Nomizu” and Spivak.!?° 
17.1. Metric and connection


For the dynamical spacetimes that we consider, there are two basic geometric ideas: 
a metric tensor $g = g_{\mu\nu}\, \vartheta^\mu \otimes \vartheta^\nu$ (which determines length and angle), and parallelism,
associated with parallel transport, covariant derivative and connection, i.e. we
consider metric-affine geometry. 

The metric gives the causal structure, arc length, area and volume. Furthermore 
it is used to raise and lower indexes (i.e. it determines a specific isomorphism 
between tangent and cotangent vectors). It also provides the paths of extremal 
length (geodesics). For the 4D spacetimes of interest here, the metric has the Lorentz 
signature. Given such a metric, there is a naturally defined associated symmetry 
group, the group of local Lorentz transformations: $L \in SO(1,3) \Rightarrow g(LX, LY) = g(X, Y)$.
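
The defining property $g(LX, LY) = g(X, Y)$ is easy to check numerically for a single boost. The following is a minimal sketch in an orthonormal frame; the signature convention $(-,+,+,+)$, the rapidity value and the two test vectors are assumptions of the example.

```python
import math

ETA = [-1.0, 1.0, 1.0, 1.0]  # Minkowski metric, signature (-,+,+,+); sign convention assumed here

def minkowski(x, y):
    """g(X, Y) for 4-vectors given by their components in an orthonormal frame."""
    return sum(ETA[a] * x[a] * y[a] for a in range(4))

def boost_x(rapidity):
    """4x4 matrix of a Lorentz boost along the x axis."""
    ch, sh = math.cosh(rapidity), math.sinh(rapidity)
    return [[ch, sh, 0.0, 0.0],
            [sh, ch, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def apply(matrix, vector):
    """Matrix acting on a column of components."""
    return [sum(matrix[a][b] * vector[b] for b in range(4)) for a in range(4)]

L = boost_x(0.7)
X = [1.0, 0.2, -0.5, 0.3]
Y = [0.4, 1.1, 0.0, -2.0]

before = minkowski(X, Y)
after = minkowski(apply(L, X), apply(L, Y))  # agrees with `before` up to rounding
```

The same check works for spatial rotations, and for products of boosts and rotations, which together make up the group referred to in the text.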

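For the other structure, let $e_\mu$ for $\mu = 0,1,2,3$ be a basis for spacetime vector
fields. The covariant differential $\nabla$ of each basis vector is a vector valued one-form,
hence some linear combination of the $e_\alpha$'s with one-form coefficients:

$$\nabla e_\beta = e_\alpha\, \Gamma^\alpha{}_\beta, \qquad \Gamma^\alpha{}_\beta = \Gamma^\alpha{}_{\beta i}\, dx^i, \tag{135}$$

called the connection one-forms.

The covariant differential of a vector field $V = e_\mu V^\mu$ is

$$\nabla V = \nabla(e_\mu V^\mu) = (\nabla e_\mu) V^\mu + e_\mu \nabla V^\mu = e_\nu\, \Gamma^\nu{}_\mu V^\mu + e_\mu\, dV^\mu =: e_\mu\, DV^\mu. \tag{136}$$

Its components are determined by the operator $D$:

$$DV^\alpha := dV^\alpha + \Gamma^\alpha{}_\beta \wedge V^\beta, \tag{137}$$

which extends, as indicated, to vector valued forms.

The notation automatically antisymmetrizes:

$$\begin{aligned}
\nabla\nabla V &= \nabla(e_\mu\, DV^\mu) = e_\mu\, D^2 V^\mu \\
&= e_\mu\left[d(dV^\mu + \Gamma^\mu{}_\nu \wedge V^\nu) + \Gamma^\mu{}_\lambda \wedge (dV^\lambda + \Gamma^\lambda{}_\nu \wedge V^\nu)\right] \\
&= e_\mu\left(d\Gamma^\mu{}_\nu + \Gamma^\mu{}_\lambda \wedge \Gamma^\lambda{}_\nu\right) \wedge V^\nu = e_\mu\, R^\mu{}_\nu \wedge V^\nu,
\end{aligned} \tag{138}$$

where

$$R^\mu{}_\nu := d\Gamma^\mu{}_\nu + \Gamma^\mu{}_\lambda \wedge \Gamma^\lambda{}_\nu = \tfrac{1}{2} R^\mu{}_{\nu ij}\, dx^i \wedge dx^j \tag{139}$$

is the curvature 2-form.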

Exterior covariant differential form notation treats some, but not all, indices as 
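differential forms. Rather than work with the operator $\nabla$ on geometric objects we
often find it more convenient to work with $D$ on their coefficients. The covariant
differential $D$ can be extended to operate on a tensor valued form of any type, e.g.

$$DP^\alpha{}_\beta = dP^\alpha{}_\beta + \Gamma^\alpha{}_\gamma \wedge P^\gamma{}_\beta - \Gamma^\gamma{}_\beta \wedge P^\alpha{}_\gamma, \tag{140}$$
$$D^2 P^\alpha{}_\beta = R^\alpha{}_\gamma \wedge P^\gamma{}_\beta - R^\gamma{}_\beta \wedge P^\alpha{}_\gamma. \tag{141}$$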
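Generically, arranging the components of a tensor valued form $\varphi$ as a row vector
we get

$$D\varphi = d\varphi + \Gamma^\alpha{}_\beta \wedge \varphi\, \sigma_\alpha{}^\beta, \tag{142}$$
$$D^2\varphi = R^\alpha{}_\beta \wedge \varphi\, \sigma_\alpha{}^\beta, \tag{143}$$

for some appropriate representation matrix $\sigma_\alpha{}^\beta$.

The special case of the vector whose components are the coframe one-forms
$\vartheta^\alpha = e^\alpha{}_i\, dx^i$ yields the torsion 2-form:

$$T^\alpha := D\vartheta^\alpha := d\vartheta^\alpha + \Gamma^\alpha{}_\beta \wedge \vartheta^\beta = \tfrac{1}{2} T^\alpha{}_{ij}\, dx^i \wedge dx^j. \tag{144}$$

On the coframe $\vartheta^\alpha$, $D^2$ gives an important special case of (143):

$$DT^\alpha \equiv D^2\vartheta^\alpha = R^\alpha{}_\beta \wedge \vartheta^\beta, \tag{145}$$

which is known as the first Bianchi identity.

Applying $D$ to the curvature 2-form $R^\alpha{}_\beta$ gives

$$\begin{aligned}
DR^\alpha{}_\beta &= dR^\alpha{}_\beta + \Gamma^\alpha{}_\gamma \wedge R^\gamma{}_\beta - \Gamma^\gamma{}_\beta \wedge R^\alpha{}_\gamma \\
&= d(d\Gamma^\alpha{}_\beta + \Gamma^\alpha{}_\gamma \wedge \Gamma^\gamma{}_\beta) + \Gamma^\alpha{}_\gamma \wedge (d\Gamma^\gamma{}_\beta + \Gamma^\gamma{}_\delta \wedge \Gamma^\delta{}_\beta) \\
&\quad - \Gamma^\gamma{}_\beta \wedge (d\Gamma^\alpha{}_\gamma + \Gamma^\alpha{}_\delta \wedge \Gamma^\delta{}_\gamma) = 0,
\end{aligned} \tag{146}$$

by explicit calculation. This is the second Bianchi identity.

With $g_{\mu\nu}$ the metric tensor, $Dg_{\mu\nu}$ defines the nonmetricity 1-form:

$$Dg_{\mu\nu} := dg_{\mu\nu} - \Gamma^\gamma{}_\mu\, g_{\gamma\nu} - \Gamma^\gamma{}_\nu\, g_{\mu\gamma}. \tag{147}$$

Correspondingly, we have

$$D^2 g_{\alpha\beta} = -R^\gamma{}_\alpha\, g_{\gamma\beta} - R^\gamma{}_\beta\, g_{\alpha\gamma}. \tag{148}$$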


17.2. Riemann—Cartan geometry 


Here we are interested in particular in the special case where the geometry can be 
regarded as a local gauge theory of an appropriate spacetime symmetry group. With 
due consideration given to the understanding of both the geometry of and physics 
in Minkowski spacetime, the appropriate choice for the symmetry group is the 
inhomogeneous Lorentz group, generally referred to as the Poincaré group. This is 
both the symmetry group for the spacetime of special relativity and the group used 
to classify elementary particles in terms of mass and spin. The group is a semidirect 
product of the translation group and the group of rotations/Lorentz boosts. The 
Noether conserved quantities associated with these global symmetries are energy— 
momentum and angular momentum/CoMM. The type of spacetime geometry with 
local Poincaré symmetry is known as Riemann—Cartan geometry. 

In Riemann–Cartan geometry, the connection is assumed to be a priori metric
compatible, $Dg_{\mu\nu} = 0$; via (148) this gives $R_{\alpha\beta} = R_{[\alpha\beta]}$ (i.e. a Lorentz Lie algebra
valued two-form). For our purposes, it is convenient to use the orthonormal
frame gauge condition; then the metric components are constant and $dg_{\mu\nu} = 0$,
so, via (147), this gives $\Gamma_{\alpha\beta} = \Gamma_{[\alpha\beta]}$ (i.e. a Lorentz Lie algebra valued one-form).
The geometry has in general nonvanishing torsion and curvature. The metric information
is encoded in the orthonormal coframe $\vartheta^\mu$, which has local Lorentz gauge
freedom.
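
Both antisymmetry statements follow in one line each. The sketch below assumes the standard form of the nonmetricity 1-form and of its second covariant differential, with indices lowered by the metric, e.g. $R_{\beta\alpha} := g_{\beta\gamma} R^\gamma{}_\alpha$.

```latex
% Antisymmetry of the curvature and connection forms from metric compatibility.
% Index lowering convention assumed: R_{\beta\alpha} := g_{\beta\gamma} R^\gamma{}_\alpha.
\begin{align*}
0 = D^2 g_{\alpha\beta}
  &= -R^\gamma{}_\alpha\, g_{\gamma\beta} - R^\gamma{}_\beta\, g_{\alpha\gamma}
   = -\left(R_{\beta\alpha} + R_{\alpha\beta}\right)
  &&\Longrightarrow\quad R_{(\alpha\beta)} = 0, \\
0 = D g_{\mu\nu}
  &= dg_{\mu\nu} - \Gamma^\gamma{}_\mu\, g_{\gamma\nu} - \Gamma^\gamma{}_\nu\, g_{\mu\gamma}
   = -\left(\Gamma_{\nu\mu} + \Gamma_{\mu\nu}\right)
  &&\Longrightarrow\quad \Gamma_{(\mu\nu)} = 0 \quad (\text{with } dg_{\mu\nu} = 0).
\end{align*}
```

The first line holds in any frame, while the second relies on the orthonormal frame gauge condition.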

Riemannian geometry is a special case with vanishing torsion, $T^\alpha = 0$; such
a connection is called symmetric. Then the geometry is given by the curvature,
which is generally nonvanishing, so the parallel transport is path dependent. On
the other hand another special case is teleparallel geometry, which has a vanishing
curvature 2-form, $R^\alpha{}_\beta = 0$. This is referred to as flat. Parallel transport is then
path independent; nevertheless it is generally nontrivial, being dependent on the
torsion, which is generally nonvanishing.


17.3. Regarding geometry and gauge 


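Note the respective similarities in the form of the commutator of the gauge covariant
derivatives for the flat space U(1) phase case ($\nabla_\mu\phi = \partial_\mu\phi + ieA_\mu\phi$) and the
Yang–Mills case ($\nabla_\mu\psi = \partial_\mu\psi + iqA^p{}_\mu T_p\psi$) compared with the spacetime geometric
case:

$$[\nabla_\mu, \nabla_\nu]\phi = ieF_{\mu\nu}\phi, \tag{149}$$
$$[\nabla_\mu, \nabla_\nu]\psi = iqF^p{}_{\mu\nu} T_p\psi, \tag{150}$$
$$[\nabla_\mu, \nabla_\nu]V^\alpha = R^\alpha{}_{\beta\mu\nu} V^\beta - T^\gamma{}_{\mu\nu} \nabla_\gamma V^\alpha. \tag{151}$$

Here, $\nabla_\mu$ is the covariant derivative; the latter relation is called the Ricci identity.
On the right hand side the respective gauge field strengths appear. One can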
see that for spacetime the curvature is the Lorentz field strength, and the torsion 
is the spacetime “translational” field strength, associated with the generator of 
infinitesimal translations, the directional derivative. 

Riemann—Cartan geometry is ideally suited to admit an interpretation as a 
local gauge theory of the symmetry group of Minkowski space, the Poincaré group. 
(In the standard Riemannian GR formulations, the torsion is a priori assumed to
vanish; then gravity does not look much like a local spacetime symmetry gauge
theory. Teleparallel geometry can be regarded as a gauge theory for translations.)

We see that, when suitably formulated, gravity has both Lorentz/rotational and
translational “vector potentials” which are similar to those of the Maxwell/Yang— 
Mills theories. 


17.4. On the affine connection and gauge theory 


The “connection” one-forms for “translations” and “Lorentz” transformations can 
be packaged together in a way that offers some further insight into their essential 
similarities and differences. 
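The Poincaré transformations on Minkowski spacetime

$$x'^\mu = \Lambda^\mu{}_\nu\, x^\nu + A^\mu, \tag{152}$$

can be conveniently represented in matrix form as

$$\begin{pmatrix} x' \\ 1 \end{pmatrix} = \begin{pmatrix} \Lambda & A \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ 1 \end{pmatrix}. \tag{153}$$

Then the matrix product

$$\begin{pmatrix} \Lambda_1 & A_1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \Lambda_2 & A_2 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} \Lambda_1\Lambda_2 & \Lambda_1 A_2 + A_1 \\ 0 & 1 \end{pmatrix} \tag{154}$$

reflects the semi-direct product structure. This matrix representation for infinitesimal
Poincaré transformations has the Lie algebra

$$\left[ \begin{pmatrix} \lambda_1 & a_1 \\ 0 & 0 \end{pmatrix}, \begin{pmatrix} \lambda_2 & a_2 \\ 0 & 0 \end{pmatrix} \right] = \begin{pmatrix} [\lambda_1, \lambda_2] & \lambda_1 a_2 - \lambda_2 a_1 \\ 0 & 0 \end{pmatrix}. \tag{155}$$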
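
The composition rule behind the semi-direct product can be verified with explicit 5×5 block matrices. This is a minimal sketch: the helper names (`poincare`, `boost_x`, `rotation_z`) and the sample parameters are hypothetical choices made for the example.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poincare(lorentz, shift):
    """Embed (Lorentz matrix, translation) in the 5x5 block form [[L, s], [0, 1]]."""
    rows = [list(lorentz[i]) + [shift[i]] for i in range(4)]
    rows.append([0.0, 0.0, 0.0, 0.0, 1.0])
    return rows

def boost_x(rapidity):
    ch, sh = math.cosh(rapidity), math.sinh(rapidity)
    return [[ch, sh, 0.0, 0.0], [sh, ch, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

def rotation_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0, 0.0], [0.0, c, -s, 0.0],
            [0.0, s, c, 0.0], [0.0, 0.0, 0.0, 1.0]]

L1, s1 = boost_x(0.4), [1.0, 0.0, 2.0, -1.0]
L2, s2 = rotation_z(0.9), [0.5, -3.0, 0.0, 4.0]

product = matmul(poincare(L1, s1), poincare(L2, s2))

# Semi-direct product rule: (L1, s1)(L2, s2) = (L1 L2, L1 s2 + s1)
expected_lorentz = matmul(L1, L2)
expected_shift = [sum(L1[i][k] * s2[k] for k in range(4)) + s1[i] for i in range(4)]
```

The translation part of the product is twisted by the first Lorentz factor, which is exactly the asymmetry that distinguishes a semi-direct from a direct product.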


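Now a connection can be viewed as a Lie algebra valued one-form. The spacetime
“translation” and “Lorentz” connections can thus be neatly packaged in terms of
the above Poincaré Lie algebra matrix representation:

$$\omega := \begin{pmatrix} \Gamma & \vartheta \\ 0 & 0 \end{pmatrix}. \tag{156}$$

The associated “curvature” Lie algebra valued 2-form

$$\Omega := d\omega + \omega \wedge \omega = \begin{pmatrix} d\Gamma + \Gamma \wedge \Gamma & d\vartheta + \Gamma \wedge \vartheta \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} R & T \\ 0 & 0 \end{pmatrix}, \tag{157}$$

includes the spacetime curvature and torsion 2-forms in one package. Furthermore,
the Bianchi identity for this Poincaré Lie algebra curvature matrix,

$$\begin{aligned}
0 \equiv D\Omega &:= d\Omega + \omega \wedge \Omega - \Omega \wedge \omega \\
&= \begin{pmatrix} dR + \Gamma \wedge R - R \wedge \Gamma & dT + \Gamma \wedge T - R \wedge \vartheta \\ 0 & 0 \end{pmatrix}
= \begin{pmatrix} DR & DT - R \wedge \vartheta \\ 0 & 0 \end{pmatrix},
\end{aligned} \tag{158}$$

unifies the first and second spacetime Bianchi identities.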

This packaging shows similarities between the connection and coframe one-forms 
and the curvature and torsion 2-forms, but also some clear differences inherited 
from the semi-direct product structure of the Poincaré group. The gauge theories 
of Yang–Mills!*4 and Utiyama!?:!4° also have the $\Omega = d\omega + \omega \wedge \omega$ and $D\Omega = 0$
form, but the groups do not have a semi-direct product structure.

Although we find this formulation quite helpful for seeing how the coframe plays 
the role of the “vector potential for translations,” we will not use it below in our 
treatment of the PG. For our purposes, we consider local Poincaré transformations
to be Lorentz transformations of the coframe plus local spacetime diffeomorphisms. 


18. Variational Principles for Dynamic Spacetime Geometry 


In this section, we develop the second-order variational principle for gravitating 
material and internal gauge fields along with their associated Noether currents and 
differential identities. The spacetime is assumed to have Riemann—Cartan geometry, 
i.e. we are considering the Poincaré gauge theory of gravity (PG). 

We are considering geometric gravity that can be regarded as a gauge theory for 
the Poincaré group. Several authors have considered such theories, see, e.g. Refs. 7, 
8, 47, 52, 56 and 80. 

We wish to consider the conserved Noether currents and differential identities as 
well as the field equations for dynamic spacetime geometry and gauge interactions. 
Here, in this section, we first work with the usual second-order type Lagrangian, 
since for that case certain expressions take a simpler form and the arguments are 
more transparent. Having established these results, we can then present more briefly 
the analogous first-order version which is the basis for our covariant Hamiltonian 
expressions. 


18.1. The Lagrangian and its variation 
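Rather than beginning with the Lagrangian 4-form of the type^w

$$\mathcal{L} = \mathcal{L}(\varphi, \vartheta^\mu, \Gamma^\alpha{}_\beta, A^p;\; d\varphi, d\vartheta^\mu, d\Gamma^\alpha{}_\beta, dA^p), \tag{159}$$

we take the more covariant form

$$\mathcal{L} = \mathcal{L}(\varphi, \vartheta^\mu, \Gamma^\alpha{}_\beta, A^p;\; D\varphi, T^\mu, R^\alpha{}_\beta, F^p), \tag{160}$$

which is no less general. Here, $\varphi$ is a generic $f$-form source field with total covariant
differential (the factor ordering is suitable for a matrix representation with $\varphi$ as a
row “vector”)

$$D\varphi = d\varphi + \Gamma^\alpha{}_\beta \wedge \varphi\, \sigma_\alpha{}^\beta + A^p \wedge \varphi\, T_p. \tag{161}$$

The torsion 2-form, curvature 2-form and gauge field strength were given earlier
in (144), (139) and (62). The respective variations are

$$\delta D\varphi = D\delta\varphi + \delta\Gamma^\alpha{}_\beta \wedge \varphi\, \sigma_\alpha{}^\beta + \delta A^p \wedge \varphi\, T_p, \tag{162}$$
$$\delta T^\alpha = D\delta\vartheta^\alpha + \delta\Gamma^\alpha{}_\beta \wedge \vartheta^\beta, \tag{163}$$
$$\delta R^\alpha{}_\beta = d\delta\Gamma^\alpha{}_\beta + \delta\Gamma^\alpha{}_\gamma \wedge \Gamma^\gamma{}_\beta + \Gamma^\alpha{}_\gamma \wedge \delta\Gamma^\gamma{}_\beta = D\delta\Gamma^\alpha{}_\beta, \tag{164}$$
$$\delta F^p = d\delta A^p + C^p{}_{qr}\, A^q \wedge \delta A^r = D\delta A^p. \tag{165}$$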

^w This is sufficiently general to include all the fundamental fields of the standard model, with $\varphi$
including the Higgs and the fermions and $A^p$ the U(1) × SU(2) × SU(3) gauge vector potential.
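The variation of the total Lagrangian 4-form (160) is:

$$\begin{aligned}
\delta\mathcal{L} &= \delta D\varphi \wedge \frac{\partial\mathcal{L}}{\partial D\varphi} + \delta\varphi \wedge \frac{\partial\mathcal{L}}{\partial\varphi} + \delta T^\mu \wedge \frac{\partial\mathcal{L}}{\partial T^\mu} + \delta\vartheta^\mu \wedge \frac{\partial\mathcal{L}}{\partial\vartheta^\mu} \\
&\quad + \delta R^\alpha{}_\beta \wedge \frac{\partial\mathcal{L}}{\partial R^\alpha{}_\beta} + \delta\Gamma^\alpha{}_\beta \wedge \frac{\partial\mathcal{L}}{\partial\Gamma^\alpha{}_\beta} + \delta F^p \wedge \frac{\partial\mathcal{L}}{\partial F^p} + \delta A^p \wedge \frac{\partial\mathcal{L}}{\partial A^p}
\end{aligned} \tag{166}$$

$$\begin{aligned}
&= \left(D\delta\varphi + \delta\Gamma^\alpha{}_\beta \wedge \varphi\, \sigma_\alpha{}^\beta + \delta A^p \wedge \varphi\, T_p\right) \wedge \frac{\partial\mathcal{L}}{\partial D\varphi} + \delta\varphi \wedge \frac{\partial\mathcal{L}}{\partial\varphi} \\
&\quad + \left(D\delta\vartheta^\mu + \delta\Gamma^\mu{}_\nu \wedge \vartheta^\nu\right) \wedge \frac{\partial\mathcal{L}}{\partial T^\mu} + \delta\vartheta^\mu \wedge \frac{\partial\mathcal{L}}{\partial\vartheta^\mu} \\
&\quad + D\delta\Gamma^\alpha{}_\beta \wedge \frac{\partial\mathcal{L}}{\partial R^\alpha{}_\beta} + \delta\Gamma^\alpha{}_\beta \wedge \frac{\partial\mathcal{L}}{\partial\Gamma^\alpha{}_\beta} \\
&\quad + D\delta A^p \wedge \frac{\partial\mathcal{L}}{\partial F^p} + \delta A^p \wedge \frac{\partial\mathcal{L}}{\partial A^p}.
\end{aligned} \tag{167}$$

“Integrating by parts” and rearranging gives

$$\begin{aligned}
\delta\mathcal{L} &= d\left(\delta\varphi \wedge \frac{\partial\mathcal{L}}{\partial D\varphi} + \delta\vartheta^\mu \wedge \frac{\partial\mathcal{L}}{\partial T^\mu} + \delta\Gamma^\alpha{}_\beta \wedge \frac{\partial\mathcal{L}}{\partial R^\alpha{}_\beta} + \delta A^p \wedge \frac{\partial\mathcal{L}}{\partial F^p}\right) \\
&\quad + \delta\varphi \wedge \left(-\varsigma\, D\frac{\partial\mathcal{L}}{\partial D\varphi} + \frac{\partial\mathcal{L}}{\partial\varphi}\right) + \delta\vartheta^\mu \wedge \left(D\frac{\partial\mathcal{L}}{\partial T^\mu} + \frac{\partial\mathcal{L}}{\partial\vartheta^\mu}\right) \\
&\quad + \delta\Gamma^\alpha{}_\beta \wedge \left(D\frac{\partial\mathcal{L}}{\partial R^\alpha{}_\beta} + \varphi\, \sigma_\alpha{}^\beta \wedge \frac{\partial\mathcal{L}}{\partial D\varphi} + \vartheta^\beta \wedge \frac{\partial\mathcal{L}}{\partial T^\alpha} + \frac{\partial\mathcal{L}}{\partial\Gamma^\alpha{}_\beta}\right) \\
&\quad + \delta A^p \wedge \left(D\frac{\partial\mathcal{L}}{\partial F^p} + \varphi\, T_p \wedge \frac{\partial\mathcal{L}}{\partial D\varphi} + \frac{\partial\mathcal{L}}{\partial A^p}\right),
\end{aligned} \tag{168}$$

which yields the conjugate momenta and Euler-Lagrange variational derivatives
according to the pattern:

$$\begin{aligned}
\delta\mathcal{L} &= d\left(\delta\varphi \wedge p + \delta\vartheta^\mu \wedge \tau_\mu + \delta\Gamma^\alpha{}_\beta \wedge \rho_\alpha{}^\beta + \delta A^p \wedge P_p\right) \\
&\quad + \delta\varphi \wedge \frac{\delta\mathcal{L}}{\delta\varphi} + \delta\vartheta^\mu \wedge \frac{\delta\mathcal{L}}{\delta\vartheta^\mu} + \delta\Gamma^\alpha{}_\beta \wedge \frac{\delta\mathcal{L}}{\delta\Gamma^\alpha{}_\beta} + \delta A^p \wedge \frac{\delta\mathcal{L}}{\delta A^p}.
\end{aligned} \tag{169}$$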

This is the key variational relation. According to Hamilton’s principle, the second- 
order field equations are the vanishing of the Euler-Lagrange expressions named 
in (169) and explicitly displayed in (168). 
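
The “integration by parts” used above amounts to the graded Leibniz rule. The following is a sketch for the source field term, assuming $\varphi$ is an $f$-form with sign factor $\varsigma := (-1)^f$, and using the fact that $d$ and $D$ agree on a gauge-scalar 3-form.

```latex
% "Integrating by parts" for the source field term; \varsigma := (-1)^f (assumed convention).
\begin{align*}
d\left(\delta\varphi \wedge \frac{\partial\mathcal{L}}{\partial D\varphi}\right)
  &= D\delta\varphi \wedge \frac{\partial\mathcal{L}}{\partial D\varphi}
   + \varsigma\, \delta\varphi \wedge D\frac{\partial\mathcal{L}}{\partial D\varphi}, \\
D\delta\varphi \wedge \frac{\partial\mathcal{L}}{\partial D\varphi}
  &= d\left(\delta\varphi \wedge \frac{\partial\mathcal{L}}{\partial D\varphi}\right)
   - \varsigma\, \delta\varphi \wedge D\frac{\partial\mathcal{L}}{\partial D\varphi} .
\end{align*}
```

For the one-forms $\vartheta^\mu$, $\Gamma^\alpha{}_\beta$ and $A^p$ the sign factor is $-1$, which is why the corresponding Euler-Lagrange expressions carry a plus sign in front of the $D$ term.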


18.2. Local gauge symmetries, Noether currents and differential 
identities 


The Lagrangian 4-form £ (160) should be “gauge” invariant under the local space- 
time gauge transformations: 


Ap =I%g pou" + a” ~T, — £29, (170) 
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Adt = IH, 0 — £70, (171) 
AT p= Diy — PoP 4, (172) 
AA? = —Da? — £7A?, (173) 


where the six $l^\alpha{}_\beta$ control an infinitesimal Lorentz rotation of the spacetime frame
$\vartheta^\alpha$, the $a^p$ control an “internal” gauge transformation and $Z$ is a spacetime vector field which
determines a “local translation.” The Lie derivative $\pounds_Z$ is given by $d i_Z + i_Z d$ on the
components of our fields. This action on the components and on the basis one-forms
correctly represents the Lie derivative on geometric objects. Under (170)–(173) the
Lagrangian 4-form $\mathcal{L}$ (160), if it depends on position only through the indicated
fields, should change according to


$$\Delta\mathcal{L} = -\pounds_Z\mathcal{L} = -d i_Z\mathcal{L}, \qquad (174)$$


which happens to be a total differential because $\mathcal{L}$ is a 4-form. With the special
variations given by (170)–(174), Eq. (169) should be satisfied identically.
One may collect all the total differential terms on the l.h.s. of the identity, giving:


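$$d\mathcal{J}(l^\alpha{}_\beta, a^p, Z) = \Delta\varphi\wedge\frac{\delta\mathcal{L}}{\delta\varphi} + \Delta\vartheta^\mu\wedge\frac{\delta\mathcal{L}}{\delta\vartheta^\mu} + \Delta\Gamma^\alpha{}_\beta\wedge\frac{\delta\mathcal{L}}{\delta\Gamma^\alpha{}_\beta} + \Delta A^p\wedge\frac{\delta\mathcal{L}}{\delta A^p}, \qquad (175)$$
where
$$\mathcal{J}(l^\alpha{}_\beta, a^p, Z) := -i_Z\mathcal{L} - \Delta\varphi\wedge\frac{\partial\mathcal{L}}{\partial D\varphi} - \Delta\vartheta^\mu\wedge\frac{\partial\mathcal{L}}{\partial T^\mu} - \Delta\Gamma^\alpha{}_\beta\wedge\frac{\partial\mathcal{L}}{\partial R^\alpha{}_\beta} - \Delta A^p\wedge\frac{\partial\mathcal{L}}{\partial F^p} \qquad (176)$$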


is the generalized total current 3-form. Using (170)-(173) we have in more detail 
(recall the canonical stress tensor (55) and EB := ced —L) 


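$$\begin{aligned}
\mathcal{J}(l^\alpha{}_\beta, a^p, Z) &= -i_Z\mathcal{L} + \left(\pounds_Z\varphi - l^\alpha{}_\beta\,\varphi\sigma_\alpha{}^\beta - a^p\,\varphi T_p\right)\wedge\frac{\partial\mathcal{L}}{\partial D\varphi} \\
&\quad + \left(\pounds_Z\vartheta^\mu - l^\mu{}_\nu\vartheta^\nu\right)\wedge\frac{\partial\mathcal{L}}{\partial T^\mu} + \left(\pounds_Z\Gamma^\alpha{}_\beta + Dl^\alpha{}_\beta\right)\wedge\frac{\partial\mathcal{L}}{\partial R^\alpha{}_\beta} \\
&\quad + \left(\pounds_Z A^p + Da^p\right)\wedge\frac{\partial\mathcal{L}}{\partial F^p}. \qquad (177)
\end{aligned}$$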


The second Noether theorem differential identities may be obtained from (175)
by comparing the coefficients of $a^p$, $l^\alpha{}_\beta$, $Z^\mu$, $da^p$, $dl^\alpha{}_\beta$, $dZ^\mu$ on both sides. The
results will be covariant, but the computation will not be manifestly so, because
of the Lie derivative terms. However, it so happens that the Lie derivative differs
from a covariant operation only by a gauge transformation, as the following short
computations reveal:


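$$\pounds_Z\varphi = d i_Z\varphi + i_Z d\varphi = D i_Z\varphi + i_Z D\varphi - \Gamma^\alpha{}_\beta(Z)\,\varphi\sigma_\alpha{}^\beta - A^p(Z)\,\varphi T_p, \qquad (178)$$
$$\pounds_Z\vartheta^\mu := d i_Z\vartheta^\mu + i_Z d\vartheta^\mu = D i_Z\vartheta^\mu + i_Z D\vartheta^\mu - \Gamma^\mu{}_\nu(Z)\,\vartheta^\nu, \qquad (179)$$
$$\pounds_Z\Gamma^\alpha{}_\beta := d i_Z\Gamma^\alpha{}_\beta + i_Z d\Gamma^\alpha{}_\beta = i_Z R^\alpha{}_\beta + D(\Gamma^\alpha{}_\beta(Z)), \qquad (180)$$
$$\pounds_Z A^p := d i_Z A^p + i_Z dA^p = i_Z F^p + D(A^p(Z)), \qquad (181)$$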

(in the last two relations on the r.h.s., the $D$ is formally defined by treating $A^p(Z)$
and $\Gamma^\alpha{}_\beta(Z)$ as tensors). Thus the translation vector field $Z$ induces a modification
to the gauge parameters

$$l'^\alpha{}_\beta := l^\alpha{}_\beta + \Gamma^\alpha{}_\beta(Z), \qquad a'^p := a^p + A^p(Z), \qquad (182)$$


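which effectively replaces the noncovariant $\pounds_Z$ by the “covariant Lie derivative”
defined by

$$\mathrm{L}_Z := D i_Z + i_Z D, \qquad (183)$$
on the “normal” fields $\varphi$, $\vartheta^\alpha$, and by
$$\mathrm{L}_Z\Gamma^\alpha{}_\beta := i_Z R^\alpha{}_\beta, \qquad \mathrm{L}_Z A^p := i_Z F^p, \qquad (184)$$

on the connection one-forms.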
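The gauge transformations (170)–(173) then take the manifestly covariant form:

$$\Delta\varphi = l'^\alpha{}_\beta\,\varphi\sigma_\alpha{}^\beta + a'^p\,\varphi T_p - \mathrm{L}_Z\varphi = l'^\alpha{}_\beta\,\varphi\sigma_\alpha{}^\beta + a'^p\,\varphi T_p - D i_Z\varphi - i_Z D\varphi, \qquad (185)$$
$$\Delta\vartheta^\mu = l'^\mu{}_\nu\vartheta^\nu + 0 - \mathrm{L}_Z\vartheta^\mu = l'^\mu{}_\nu\vartheta^\nu + 0 - DZ^\mu - i_Z D\vartheta^\mu, \qquad (186)$$
$$\Delta\Gamma^\alpha{}_\beta = -Dl'^\alpha{}_\beta + 0 - \mathrm{L}_Z\Gamma^\alpha{}_\beta = -Dl'^\alpha{}_\beta + 0 - i_Z R^\alpha{}_\beta, \qquad (187)$$
$$\Delta A^p = 0 - Da'^p - \mathrm{L}_Z A^p = 0 - Da'^p - i_Z F^p. \qquad (188)$$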


If one specializes to matter fields which are not form fields (which is the only
kind of matter that we know of physically) then one can see here a striking
pattern which supports our identification of the geometric gauge fields. From the
r.h.s. expressions, one finds that most of the fields transform algebraically under
infinitesimal internal gauge transformations $a'^p$; the only field that transforms
with $Da'^p$ is the internal connection one-form (a.k.a. gauge vector potential) $A^p$.
Similarly, most of the fields transform algebraically in $l'^\alpha{}_\beta$; the only field which
transforms according to the differential $Dl'^\alpha{}_\beta$ is the spacetime connection
one-form $\Gamma^\alpha{}_\beta$. Moreover (in the “physical” 0-form matter case) the only field having a
differential, $DZ^\mu$, rather than an algebraic transformation formula under the
infinitesimal spacetime displacement $Z^\mu$ is the coframe one-form $\vartheta^\mu$. This is one
more way of seeing that the coframe one-form can be identified as the “translational
gauge vector potential.”
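With the above reparametrization, the generalized total current 3-form takes
the form:

$$\begin{aligned}
\mathcal{J}(l'^\alpha{}_\beta, a'^p, Z) &= -i_Z\mathcal{L} + \left(\mathrm{L}_Z\varphi - l'^\alpha{}_\beta\,\varphi\sigma_\alpha{}^\beta - a'^p\,\varphi T_p\right)\wedge\frac{\partial\mathcal{L}}{\partial D\varphi} \\
&\quad + \left(\mathrm{L}_Z\vartheta^\mu - l'^\mu{}_\nu\vartheta^\nu\right)\wedge\frac{\partial\mathcal{L}}{\partial T^\mu} + \left(\mathrm{L}_Z\Gamma^\alpha{}_\beta + Dl'^\alpha{}_\beta\right)\wedge\frac{\partial\mathcal{L}}{\partial R^\alpha{}_\beta} \\
&\quad + \left(\mathrm{L}_Z A^p + Da'^p\right)\wedge\frac{\partial\mathcal{L}}{\partial F^p}. \qquad (189)
\end{aligned}$$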


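To extract the differential identities from (175) it is best to write $\mathcal{J}(l'^\alpha{}_\beta, a'^p, Z)$
as terms algebraically linear in $l'^\alpha{}_\beta$, $a'^p$, $Z$ plus a total differential:

$$\begin{aligned}
\mathcal{J}(l'^\alpha{}_\beta, a'^p, Z) &= -i_Z\mathcal{L} + d\left(i_Z\varphi\wedge\frac{\partial\mathcal{L}}{\partial D\varphi} + i_Z\vartheta^\mu\frac{\partial\mathcal{L}}{\partial T^\mu} + l'^\alpha{}_\beta\frac{\partial\mathcal{L}}{\partial R^\alpha{}_\beta} + a'^p\frac{\partial\mathcal{L}}{\partial F^p}\right) \\
&\quad + i_Z D\varphi\wedge\frac{\partial\mathcal{L}}{\partial D\varphi} + i_Z D\vartheta^\mu\wedge\frac{\partial\mathcal{L}}{\partial T^\mu} + i_Z R^\alpha{}_\beta\wedge\frac{\partial\mathcal{L}}{\partial R^\alpha{}_\beta} \\
&\quad + i_Z F^p\wedge\frac{\partial\mathcal{L}}{\partial F^p} + \varsigma\, i_Z\varphi\wedge D\frac{\partial\mathcal{L}}{\partial D\varphi} - i_Z\vartheta^\mu\, D\frac{\partial\mathcal{L}}{\partial T^\mu} \\
&\quad - l'^\alpha{}_\beta\left(D\frac{\partial\mathcal{L}}{\partial R^\alpha{}_\beta} + \varphi\sigma_\alpha{}^\beta\wedge\frac{\partial\mathcal{L}}{\partial D\varphi} + \vartheta^\beta\wedge\frac{\partial\mathcal{L}}{\partial T^\alpha}\right) \\
&\quad - a'^p\left(D\frac{\partial\mathcal{L}}{\partial F^p} + \varphi T_p\wedge\frac{\partial\mathcal{L}}{\partial D\varphi}\right). \qquad (190)
\end{aligned}$$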


The total differential will later be related to the total energy-momentum. For now
merely note that it does not contribute to the l.h.s. of (175) as $d^2 = 0$.
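The passage from (189) to (190) amounts to four applications of the graded Leibniz rule. The sketch below spells them out under two labeled assumptions: $P$ is used only as shorthand for $\partial\mathcal{L}/\partial D\varphi$, and $\varsigma := (-1)^f$ with $f$ the form degree of $\varphi$ (this appears to be how the sign factor is used in this chapter, but its definition lies outside this section).

```latex
% Graded Leibniz rule for a k-form \alpha, when \alpha\wedge\beta is a gauge scalar:
%   d(\alpha\wedge\beta) = D\alpha\wedge\beta + (-1)^k \alpha\wedge D\beta .
% Assumptions: P := \partial L/\partial D\varphi,  \varsigma := (-1)^f,  f = deg(\varphi).
\begin{align*}
D(i_Z\varphi)\wedge P
  &= d\bigl(i_Z\varphi\wedge P\bigr) + \varsigma\, i_Z\varphi\wedge DP
  && \bigl(i_Z\varphi \text{ is an } (f-1)\text{-form}\bigr), \\
D(i_Z\vartheta^\mu)\wedge\frac{\partial\mathcal{L}}{\partial T^\mu}
  &= d\Bigl(i_Z\vartheta^\mu\,\frac{\partial\mathcal{L}}{\partial T^\mu}\Bigr)
     - i_Z\vartheta^\mu\, D\frac{\partial\mathcal{L}}{\partial T^\mu}
  && \bigl(i_Z\vartheta^\mu \text{ is a 0-form}\bigr), \\
Dl'^\alpha{}_\beta\wedge\frac{\partial\mathcal{L}}{\partial R^\alpha{}_\beta}
  &= d\Bigl(l'^\alpha{}_\beta\,\frac{\partial\mathcal{L}}{\partial R^\alpha{}_\beta}\Bigr)
     - l'^\alpha{}_\beta\, D\frac{\partial\mathcal{L}}{\partial R^\alpha{}_\beta}, \\
Da'^p\wedge\frac{\partial\mathcal{L}}{\partial F^p}
  &= d\Bigl(a'^p\,\frac{\partial\mathcal{L}}{\partial F^p}\Bigr)
     - a'^p\, D\frac{\partial\mathcal{L}}{\partial F^p}.
\end{align*}
```

Inserting these four relations into (189), with $\mathrm{L}_Z = D i_Z + i_Z D$ on $\varphi$, $\vartheta^\mu$ and $\mathrm{L}_Z\Gamma^\alpha{}_\beta = i_Z R^\alpha{}_\beta$, $\mathrm{L}_Z A^p = i_Z F^p$, collects the exact pieces into the single total differential of (190).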

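Internal gauge symmetry. With $Z = 0 = l'^\alpha{}_\beta$ the general result (190) reduces to
the expression we considered earlier for internal gauge symmetry of the Yang-Mills
type (64), and we again obtain just as before (recall the argument leading to (76)
and (70)):

$$\frac{\partial\mathcal{L}}{\partial A^p} = 0, \qquad D\frac{\delta\mathcal{L}}{\delta A^p} + \varphi T_p\wedge\frac{\delta\mathcal{L}}{\delta\varphi} = 0. \qquad (191)$$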


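Local Lorentz gauge symmetry. In the same fashion, with $Z = 0 = a'^p$ we obtain
from (175) and (185)–(187)

$$\begin{aligned}
d\mathcal{J}(l'^\alpha{}_\beta, 0, 0) &= -d\left\{ l'^\alpha{}_\beta\left(D\frac{\partial\mathcal{L}}{\partial R^\alpha{}_\beta} + \varphi\sigma_\alpha{}^\beta\wedge\frac{\partial\mathcal{L}}{\partial D\varphi} + \vartheta^\beta\wedge\frac{\partial\mathcal{L}}{\partial T^\alpha}\right)\right\} \\
&= l'^\alpha{}_\beta\,\varphi\sigma_\alpha{}^\beta\wedge\frac{\delta\mathcal{L}}{\delta\varphi} + l'^\alpha{}_\beta\,\vartheta^\beta\wedge\frac{\delta\mathcal{L}}{\delta\vartheta^\alpha} + \left(-Dl'^\alpha{}_\beta\right)\wedge\frac{\delta\mathcal{L}}{\delta\Gamma^\alpha{}_\beta}. \qquad (192)
\end{aligned}$$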
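Equating the coefficients of $l'^\alpha{}_\beta$ and $Dl'^\alpha{}_\beta$ (keeping in mind that $l'^{\alpha\beta}$ is
antisymmetric) leads to the algebraic and differential identities:

$$\frac{\partial\mathcal{L}}{\partial\Gamma^{\alpha\beta}} = 0, \qquad (193)$$
$$D\frac{\delta\mathcal{L}}{\delta\Gamma^{\alpha\beta}} + \varphi\sigma_{\alpha\beta}\wedge\frac{\delta\mathcal{L}}{\delta\varphi} + \vartheta_{[\beta}\wedge\frac{\delta\mathcal{L}}{\delta\vartheta^{\alpha]}} = 0. \qquad (194)$$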


Formally these two relations are quite similar to those found for the internal sym- 
metries (191). Consequently local Lorentz symmetry is in certain respects rather 
like an internal gauge symmetry. 

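The conditions $\partial\mathcal{L}/\partial A^p = 0 = \partial\mathcal{L}/\partial\Gamma^\alpha{}_\beta$ mean (as we expected) that how the
internal and Lorentz gauge potentials can appear is quite restricted, i.e. only via
the covariant derivative or the associated field strength. With these conditions, we
note that the generalized current has the neat form

$$\begin{aligned}
\mathcal{J}(l'^\alpha{}_\beta, a'^p, Z) &= Z^\mu\mathcal{J}_\mu - l'^\alpha{}_\beta\frac{\delta\mathcal{L}}{\delta\Gamma^\alpha{}_\beta} - a'^p\frac{\delta\mathcal{L}}{\delta A^p} \\
&\quad + d\left(i_Z\varphi\wedge\frac{\partial\mathcal{L}}{\partial D\varphi} + Z^\mu\frac{\partial\mathcal{L}}{\partial T^\mu} + l'^\alpha{}_\beta\frac{\partial\mathcal{L}}{\partial R^\alpha{}_\beta} + a'^p\frac{\partial\mathcal{L}}{\partial F^p}\right). \qquad (195)
\end{aligned}$$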


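Local diffeomorphism invariance. This was already considered for general theories
in first-order form in Sec. 14.1. Here, we consider local translations in second-order
form while distinguishing between the source, gauge and geometric variables.
This case again has some similarities to the internal and Lorentz symmetries but
also some striking differences. With $l'^\alpha{}_\beta = 0 = a'^p$ in (175) we have

$$\begin{aligned}
d\mathcal{J}(0, 0, Z^\mu) &= d(Z^\mu\mathcal{J}_\mu) = DZ^\mu\wedge\mathcal{J}_\mu + Z^\mu D\mathcal{J}_\mu \\
&= -\left(D i_Z\varphi + i_Z D\varphi\right)\wedge\frac{\delta\mathcal{L}}{\delta\varphi} - \left(D i_Z\vartheta^\mu + i_Z D\vartheta^\mu\right)\wedge\frac{\delta\mathcal{L}}{\delta\vartheta^\mu} \\
&\quad - i_Z R^\alpha{}_\beta\wedge\frac{\delta\mathcal{L}}{\delta\Gamma^\alpha{}_\beta} - i_Z F^p\wedge\frac{\delta\mathcal{L}}{\delta A^p}. \qquad (196)
\end{aligned}$$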


The coefficient of $Z^\mu$ on both sides of (196) is the differential identity associated
with translation invariance (related to energy-momentum) while the coefficient of
$DZ^\mu$ provides a new algebraic expression for $\mathcal{J}_\mu$ in terms of the Euler-Lagrange
variations:

$$\mathcal{J}_\mu = -i_{e_\mu}\varphi\wedge\frac{\delta\mathcal{L}}{\delta\varphi} - \frac{\delta\mathcal{L}}{\delta\vartheta^\mu}. \qquad (197)$$

For ordinary (i.e. 0-form valued) matter fields the first term vanishes, leading to
the especially simple and intuitively reasonable relation:

$$\mathcal{J}_\mu = -\frac{\delta\mathcal{L}}{\delta\vartheta^\mu}. \qquad (198)$$

We emphasize that such a neat formula for the translational current 3-form is only
possible if one regards the coframe as a dynamical variable. It is also noteworthy
that this remarkable relation, which does not restrict how the translation gauge
potential appears in the action, was obtained from the coefficient of $DZ^\mu$, whereas
the corresponding relation in both the internal and local Lorentz cases put quite
severe limits, viz. (191a), (193), on how those gauge potentials could appear.

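The identity (197) permits $\mathcal{J}(l'^\alpha{}_\beta, a'^p, Z)$ (195) to be written in the nice
symmetrical form

$$\begin{aligned}
\mathcal{J}(l'^\alpha{}_\beta, a'^p, Z) &= -i_Z\varphi\wedge\frac{\delta\mathcal{L}}{\delta\varphi} - i_Z\vartheta^\mu\frac{\delta\mathcal{L}}{\delta\vartheta^\mu} - l'^\alpha{}_\beta\frac{\delta\mathcal{L}}{\delta\Gamma^\alpha{}_\beta} - a'^p\frac{\delta\mathcal{L}}{\delta A^p} \\
&\quad + d\left(i_Z\varphi\wedge\frac{\partial\mathcal{L}}{\partial D\varphi} + i_Z\vartheta^\mu\frac{\partial\mathcal{L}}{\partial T^\mu} + l'^\alpha{}_\beta\frac{\partial\mathcal{L}}{\partial R^\alpha{}_\beta} + a'^p\frac{\partial\mathcal{L}}{\partial F^p}\right). \qquad (199)
\end{aligned}$$

Restricting to the case where the source fields are 0-forms, we have

$$\begin{aligned}
\mathcal{J}(l'^\alpha{}_\beta, a'^p, Z) &= -Z^\mu\frac{\delta\mathcal{L}}{\delta\vartheta^\mu} - l'^\alpha{}_\beta\frac{\delta\mathcal{L}}{\delta\Gamma^\alpha{}_\beta} - a'^p\frac{\delta\mathcal{L}}{\delta A^p} \\
&\quad + d\left(Z^\mu\frac{\partial\mathcal{L}}{\partial T^\mu} + l'^\alpha{}_\beta\frac{\partial\mathcal{L}}{\partial R^\alpha{}_\beta} + a'^p\frac{\partial\mathcal{L}}{\partial F^p}\right). \qquad (200)
\end{aligned}$$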


This result displays the purely gauge nature of the current — it is especially note- 
worthy that there is no explicit appearance of the source field or its field equation. 

Note that if we take the variational derivatives as field equations, the numerical
value of $\mathcal{J}$ will come entirely from the total differential term. When the 3-form $\mathcal{J}$
is integrated over a 3D region this total differential becomes, via the fundamental
boundary theorem, (34), an integral over the 2D boundary of the region. In other
words, the value is quasi-local.

Our results here are an application of Noether’s ideas. As expected from her 
2nd theorem, with a local symmetry the conserved current becomes a differential 
identity. Furthermore, we also displayed here detailed results that exactly reflect her 
remarks about verifying and generalizing Hilbert’s assertion regarding the lack of a 
proper energy-momentum. Our generalized current expressions (199) and (200) are 
linear combinations of Euler-Lagrange expressions plus a total differential. They 
have the usual conserved current ambiguity: The total differential does not con- 
tribute to the conservation law, it can be adjusted without affecting the conserva- 
tion property, nevertheless it affects the value of the associated conserved quantity. 
The second-order Lagrangian formalism has no way to control this ambiguity. 


18.3. Interpretation of the differential identities 
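Let us now specialize and consider the customary “minimally coupled” decomposition:
$$\mathcal{L}_{\rm total} = \mathcal{L}_{\rm PG}(\vartheta^\mu, T^\mu, R^\alpha{}_\beta) + \mathcal{L}_A(\vartheta^\mu, F^p) + \mathcal{L}_\varphi(\varphi, D\varphi, \vartheta^\mu). \qquad (201)$$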


Each of these separate Lagrangian pieces is a scalar valued 4-form. So, the Noether
identities we have obtained can be applied to each piece. However, it must be kept
in mind that for the separate pieces our variational derivatives are, in general, no
longer field equations. There is, however, one exception:

$$\frac{\delta\mathcal{L}_\varphi}{\delta\varphi} = \frac{\delta\mathcal{L}_{\rm total}}{\delta\varphi}, \qquad (202)$$

which vanishes “on shell.” It should also be noted that for each separate piece of
$\mathcal{L}_{\rm total}$ many of the terms in the various identities (191), (194), (196) and (197) will
vanish trivially. Consider (191): for $\mathcal{L}_{\rm PG}$ it is trivial; for $\mathcal{L}_A$ and $\mathcal{L}_\varphi$ it reduces to
the results obtained earlier (79).

Let us introduce some suitable names for certain expressions. Specifically for
the material source, the energy-momentum and spin density 3-forms are defined
respectively by


aL, » Ske 


a Boa, 9) 
ESE. gu” a = aT (203) 
with an analogous expression defining T;}. 
Then (194) for £, and £4 becomes, respectively, 
DOS? + 9° NTE — GF ATS = 2a" A Fo =0 (on shell), (204) 
OAS OF ASS = 0. (205) 


The physical interpretation of these concerns angular momentum conservation (or 
more precisely the exchange of angular momentum with the gravitational field). The 
first term in (204) is the (covariant) divergence of the source spin density, the next 
two terms (the anti-symmetric part of the energy-momentum density) describe the 
change in “orbital” angular momentum. The second relation is an identity which 
shows that gauge fields have symmetric energy-momentum densities and trivial 
angular momentum conservation relations. For $\mathcal{L}_{\rm PG}$, (194) reduces to the identity


OLor _ Rial OLoer OLor | sa | = 


Rt A (206) 


jock Dp 4 
OR 8 BRoy TMI ( aTal * Agel 


which is satisfied iff $\mathcal{L}_{\rm PG}$ is a local Lorentz scalar.


Now let us consider the differential identity part of (196). For a 0-form material 
source field, it takes the form 


bLy 
Th 5p 
When the source field equation is satisfied, this relation describes the exchange of 
material energy-momentum with the gravitational field. The analogous expression 
for the gauge field Lagrangian is a little simpler: 


1 
DEfi — te, T” AEE — Fie R°? \ Ggq = D =0 (onshell). (207) 


OLA 
OFP 


D&4 — ie, T” AT) + te, F? AD 


Il 
= 


(208) 


the interpretation is similar. 
Finally, for the differential identity of (196) applied to $\mathcal{L}_{\rm PG}$, after some
straightforward calculation, we have the identity


OLor . 7 OLer OLor 
Pu — tent A (p aT” a) 
: OLor . Lor 
_ x8 _ a — 
ig, RA DIE — DT? Nie, =e = 0, (209) 


which is satisfied if $\mathcal{L}_{\rm PG}$ is a scalar valued 4-form constructed out of the coframe,
torsion and curvature. This is the PG identity that plays a role analogous to that
of the contracted Bianchi identity in GR.

To obtain these detailed results, we used the definition of the various Euler— 
Lagrange expressions given in (168) and (169). 


19. First-Order Form and the Hamiltonian 


Here, for our general PG with generic matter and gauge sources, we briefly present 
the first-order form along with the associated Noether currents and differential iden- 
tities and then the associated covariant Hamiltonian including our preferred bound- 
ary term which yields our quasi-local quantities. 


19.1. First-order Lagrangian and local gauge symmetries 


For certain purposes we find a first-order formulation convenient. The first-order 
form of our variational principle for geometry and gauge is 

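$$\begin{aligned}
\mathcal{L}^{\rm 1st} &= D\varphi\wedge p + D\vartheta^\alpha\wedge\tau_\alpha + R^{\alpha\beta}\wedge\rho_{\alpha\beta} + F^p\wedge P_p \\
&\quad - \Lambda(\varphi, \vartheta^\alpha, \Gamma^{\alpha\beta}, A^p;\, p, \tau_\alpha, \rho_{\alpha\beta}, P_p), \qquad (210)
\end{aligned}$$

with $\Gamma^{\alpha\beta}$, $R^{\alpha\beta}$ and $\rho_{\alpha\beta}$ being a priori antisymmetric. The variation takes the
pattern

$$\begin{aligned}
\delta\mathcal{L}^{\rm 1st} &=: d\left(\delta\varphi\wedge p + \delta\vartheta^\alpha\wedge\tau_\alpha + \delta\Gamma^{\alpha\beta}\wedge\rho_{\alpha\beta} + \delta A^p\wedge P_p\right) \\
&\quad + \delta\varphi\wedge\frac{\delta\mathcal{L}^{\rm 1st}}{\delta\varphi} + \delta\vartheta^\alpha\wedge\frac{\delta\mathcal{L}^{\rm 1st}}{\delta\vartheta^\alpha} + \delta\Gamma^{\alpha\beta}\wedge\frac{\delta\mathcal{L}^{\rm 1st}}{\delta\Gamma^{\alpha\beta}} + \delta A^p\wedge\frac{\delta\mathcal{L}^{\rm 1st}}{\delta A^p} \\
&\quad + \frac{\delta\mathcal{L}^{\rm 1st}}{\delta p}\wedge\delta p + \frac{\delta\mathcal{L}^{\rm 1st}}{\delta\tau_\alpha}\wedge\delta\tau_\alpha + \frac{\delta\mathcal{L}^{\rm 1st}}{\delta\rho_{\alpha\beta}}\wedge\delta\rho_{\alpha\beta} + \frac{\delta\mathcal{L}^{\rm 1st}}{\delta P_p}\wedge\delta P_p, \qquad (211)
\end{aligned}$$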
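where
$$\frac{\delta\mathcal{L}^{\rm 1st}}{\delta\varphi} := -\varsigma Dp - \frac{\partial\Lambda}{\partial\varphi}, \qquad \frac{\delta\mathcal{L}^{\rm 1st}}{\delta p} := D\varphi - \frac{\partial\Lambda}{\partial p}, \qquad (212)$$
$$\frac{\delta\mathcal{L}^{\rm 1st}}{\delta\vartheta^\alpha} := D\tau_\alpha - \frac{\partial\Lambda}{\partial\vartheta^\alpha}, \qquad \frac{\delta\mathcal{L}^{\rm 1st}}{\delta\tau_\alpha} := D\vartheta^\alpha - \frac{\partial\Lambda}{\partial\tau_\alpha}, \qquad (213)$$
$$\frac{\delta\mathcal{L}^{\rm 1st}}{\delta\Gamma^{\alpha\beta}} := D\rho_{\alpha\beta} + \varphi\sigma_{\alpha\beta}\wedge p + \vartheta_{[\beta}\wedge\tau_{\alpha]} - \frac{\partial\Lambda}{\partial\Gamma^{\alpha\beta}}, \qquad \frac{\delta\mathcal{L}^{\rm 1st}}{\delta\rho_{\alpha\beta}} := R^{\alpha\beta} - \frac{\partial\Lambda}{\partial\rho_{\alpha\beta}}, \qquad (214)$$
$$\frac{\delta\mathcal{L}^{\rm 1st}}{\delta A^p} := DP_p + \varphi T_p\wedge p - \frac{\partial\Lambda}{\partial A^p}, \qquad \frac{\delta\mathcal{L}^{\rm 1st}}{\delta P_p} := F^p - \frac{\partial\Lambda}{\partial P_p}. \qquad (215)$$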


It is instructive to compare these (and subsequent) relations with the corresponding 
ones in the previous section. Here, we have twice as many fields and thus twice as 
many Euler-Lagrange expressions, but, on the other hand, the Euler-Lagrange 
expressions are all linear and much simpler. 

The first-order formulation is convenient for imposing certain constraints. In 
particular in order to impose one of the conditions 


$$R^\alpha{}_\beta = 0 \quad \text{teleparallel connection}, \qquad (216)$$
$$T^\alpha = 0 \quad \text{symmetric connection}, \qquad (217)$$


in the first-order formalism we need merely take the potential $\Lambda$ to be independent
of the corresponding conjugate momenta. The momentum then functions as a
Lagrange multiplier which imposes the constraint. The related “coordinate” equation
then loses its dynamical significance and instead becomes a relation for determining
the multiplier.


19.2. Generalized Hamiltonian and differential identities 


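To obtain the Noether differential identities in this mode we need, in addition
to (185)–(188), the gauge transformations of the conjugate momenta:

$$\Delta p = -l'^\alpha{}_\beta\,\sigma_\alpha{}^\beta p - a'^p\,T_p p - \mathrm{L}_Z p, \qquad (218)$$
$$\Delta\tau_\alpha = -l'^\beta{}_\alpha\,\tau_\beta - \mathrm{L}_Z\tau_\alpha, \qquad (219)$$
$$\Delta\rho_\alpha{}^\beta = l'^\beta{}_\gamma\,\rho_\alpha{}^\gamma - l'^\gamma{}_\alpha\,\rho_\gamma{}^\beta - \mathrm{L}_Z\rho_\alpha{}^\beta, \qquad (220)$$
$$\Delta P_q = -P_p\,C^p{}_{qr}\,a'^r - \mathrm{L}_Z P_q, \qquad (221)$$
where $\mathrm{L}_Z = D i_Z + i_Z D$. These results were deduced from relations like
$$\Delta(p\wedge D\varphi) = \Delta p\wedge D\varphi + p\wedge\Delta(D\varphi), \qquad (222)$$

using the fact that $p\wedge D\varphi$ is a scalar valued 4-form.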
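As before $\Delta\mathcal{L}^{\rm 1st}$ is a total differential: $-d i_Z\mathcal{L}^{\rm 1st}$. Taking the total differential terms
to the l.h.s. in (211) gives

$$\begin{aligned}
d\mathcal{H}(l'^{\alpha\beta}, a'^p, Z) &= \Delta\varphi\wedge\frac{\delta\mathcal{L}^{\rm 1st}}{\delta\varphi} + \Delta\vartheta^\alpha\wedge\frac{\delta\mathcal{L}^{\rm 1st}}{\delta\vartheta^\alpha} + \Delta\Gamma^{\alpha\beta}\wedge\frac{\delta\mathcal{L}^{\rm 1st}}{\delta\Gamma^{\alpha\beta}} + \Delta A^p\wedge\frac{\delta\mathcal{L}^{\rm 1st}}{\delta A^p} \\
&\quad + \frac{\delta\mathcal{L}^{\rm 1st}}{\delta p}\wedge\Delta p + \frac{\delta\mathcal{L}^{\rm 1st}}{\delta\tau_\alpha}\wedge\Delta\tau_\alpha + \frac{\delta\mathcal{L}^{\rm 1st}}{\delta\rho_{\alpha\beta}}\wedge\Delta\rho_{\alpha\beta} + \frac{\delta\mathcal{L}^{\rm 1st}}{\delta P_p}\wedge\Delta P_p, \qquad (223)
\end{aligned}$$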
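where the first-order “generalized current” is

$$\begin{aligned}
\mathcal{H}(l'^{\alpha\beta}, a'^p, Z) &= -i_Z\mathcal{L}^{\rm 1st} - \Delta\varphi\wedge p - \Delta\vartheta^\alpha\wedge\tau_\alpha - \Delta\Gamma^{\alpha\beta}\wedge\rho_{\alpha\beta} - \Delta A^p\wedge P_p, \qquad (224) \\
&= i_Z\Lambda - i_Z\left(D\varphi\wedge p + D\vartheta^\alpha\wedge\tau_\alpha + R^{\alpha\beta}\wedge\rho_{\alpha\beta} + F^p\wedge P_p\right) \\
&\quad + \left(\mathrm{L}_Z\varphi - l'^{\alpha\beta}\varphi\sigma_{\alpha\beta} - a'^p\varphi T_p\right)\wedge p + \left(\mathrm{L}_Z\vartheta^\alpha - l'^\alpha{}_\beta\vartheta^\beta\right)\wedge\tau_\alpha \\
&\quad + \left(\mathrm{L}_Z\Gamma^{\alpha\beta} + Dl'^{\alpha\beta}\right)\wedge\rho_{\alpha\beta} + \left(\mathrm{L}_Z A^p + Da'^p\right)\wedge P_p, \qquad (225) \\
&= i_Z\Lambda + \varsigma D\varphi\wedge i_Z p - D\vartheta^\alpha\wedge i_Z\tau_\alpha - R^{\alpha\beta}\wedge i_Z\rho_{\alpha\beta} - F^p\wedge i_Z P_p \\
&\quad + \left(D i_Z\varphi - l'^{\alpha\beta}\varphi\sigma_{\alpha\beta} - a'^p\varphi T_p\right)\wedge p + \left(D i_Z\vartheta^\alpha - l'^\alpha{}_\beta\vartheta^\beta\right)\wedge\tau_\alpha \\
&\quad + Dl'^{\alpha\beta}\wedge\rho_{\alpha\beta} + Da'^p\wedge P_p, \qquad (226) \\
&= i_Z\Lambda + \varsigma D\varphi\wedge i_Z p - D\vartheta^\alpha\wedge i_Z\tau_\alpha - R^{\alpha\beta}\wedge i_Z\rho_{\alpha\beta} - F^p\wedge i_Z P_p \\
&\quad + \varsigma\, i_Z\varphi\wedge Dp - i_Z\vartheta^\alpha\, D\tau_\alpha \\
&\quad - l'^{\alpha\beta}\left(D\rho_{\alpha\beta} + \varphi\sigma_{\alpha\beta}\wedge p + \vartheta_{[\beta}\wedge\tau_{\alpha]}\right) - a'^p\left(DP_p + \varphi T_p\wedge p\right) \\
&\quad + d\left(i_Z\varphi\wedge p + i_Z\vartheta^\alpha\tau_\alpha + l'^{\alpha\beta}\rho_{\alpha\beta} + a'^p P_p\right). \qquad (227)
\end{aligned}$$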


This expression is not just a Noether current, it is, as we already justified quite gen- 
erally in Sec. 14.2, the (generalized) Hamiltonian 3-form, i.e. the canonical generator 
for internal, local Lorentz and local spacetime displacements (which includes the 
time evolution for any choice of time). Nevertheless, like the second-order current, 
one still has the various differential identities. 

Local internal gauge symmetry. Equating coefficients of $a'^p$ and $Da'^p$ in (223)
gives the algebraic and differential identities:

OA hy ale i get ry Wald 

par eae PO oe ap Oe ape 
which can be compared with (191). The significance is the same, but now the 
r.h.s. includes also Euler-Lagrange variations w.r.t. momentum variables. If these 


A PrO' nq, (228) 


equations are imposed, the expression has the same form as (191). 
Local Lorentz symmetry. Equating coefficients of $l'^{\alpha\beta}$ and $Dl'^{\alpha\beta}$ gives the
algebraic and differential identities:

$$\frac{\partial\Lambda}{\partial\Gamma^{\alpha\beta}} = 0, \qquad (229)$$
oft List 6fist ofist 
Daa = £0 [ag] \ ie f Vg A 50a ap \\ O[ap|P 
6fist 6cist List 
" A. Baers a 
57 la A Tp] 5 ph Ap 2 5p PBy ( 30) 


which should be compared with (193) and (194). The significance is the same, but
now the r.h.s. of the latter includes also several Euler-Lagrange variations w.r.t.
momentum variables. If these relations are imposed, the expression has the same
form as (194).
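Local translation symmetry. With the decomposition $\mathcal{H}(0, 0, Z^\mu) = Z^\mu\mathcal{H}_\mu + d\mathcal{B}$,

$$\begin{aligned}
d\mathcal{H}(0, 0, Z^\mu) &= DZ^\mu\wedge\mathcal{H}_\mu + Z^\mu D\mathcal{H}_\mu \\
&= -\mathrm{L}_Z\varphi\wedge\frac{\delta\mathcal{L}^{\rm 1st}}{\delta\varphi} - \mathrm{L}_Z\vartheta^\alpha\wedge\frac{\delta\mathcal{L}^{\rm 1st}}{\delta\vartheta^\alpha} - \mathrm{L}_Z\Gamma^{\alpha\beta}\wedge\frac{\delta\mathcal{L}^{\rm 1st}}{\delta\Gamma^{\alpha\beta}} - \mathrm{L}_Z A^p\wedge\frac{\delta\mathcal{L}^{\rm 1st}}{\delta A^p} \\
&\quad - \frac{\delta\mathcal{L}^{\rm 1st}}{\delta p}\wedge\mathrm{L}_Z p - \frac{\delta\mathcal{L}^{\rm 1st}}{\delta\tau_\alpha}\wedge\mathrm{L}_Z\tau_\alpha - \frac{\delta\mathcal{L}^{\rm 1st}}{\delta\rho_{\alpha\beta}}\wedge\mathrm{L}_Z\rho_{\alpha\beta} - \frac{\delta\mathcal{L}^{\rm 1st}}{\delta P_p}\wedge\mathrm{L}_Z P_p. \qquad (231)
\end{aligned}$$

From the coefficient of $Z^\mu$, we obtain a differential identity involving $\mathcal{H}_\mu$; this
relation includes the conservation of energy-momentum.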

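We also get a new algebraic identity giving $\mathcal{H}_\mu$ in terms of variational derivatives
(compare (197)):

$$\begin{aligned}
\mathcal{H}_\mu &= -i_{e_\mu}\varphi\wedge\frac{\delta\mathcal{L}^{\rm 1st}}{\delta\varphi} - \frac{\delta\mathcal{L}^{\rm 1st}}{\delta\vartheta^\mu} + \varsigma\frac{\delta\mathcal{L}^{\rm 1st}}{\delta p}\wedge i_{e_\mu}p - \frac{\delta\mathcal{L}^{\rm 1st}}{\delta\tau_\alpha}\wedge i_{e_\mu}\tau_\alpha \\
&\quad - \frac{\delta\mathcal{L}^{\rm 1st}}{\delta\rho_{\alpha\beta}}\wedge i_{e_\mu}\rho_{\alpha\beta} - \frac{\delta\mathcal{L}^{\rm 1st}}{\delta P_p}\wedge i_{e_\mu}P_p. \qquad (232)
\end{aligned}$$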


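When the momentum relations are imposed, this reduces to the corresponding
expression for the second-order Noether translational current (197). Inserting this
new result into (227) gives an expression for the Hamiltonian 3-form in terms of
variational coefficients (which has the same form as (199) if the momentum relations
are imposed)

$$\begin{aligned}
\mathcal{H}(l'^{\alpha\beta}, a'^p, Z) &= -i_Z\varphi\wedge\frac{\delta\mathcal{L}^{\rm 1st}}{\delta\varphi} - i_Z\vartheta^\mu\frac{\delta\mathcal{L}^{\rm 1st}}{\delta\vartheta^\mu} - l'^{\alpha\beta}\frac{\delta\mathcal{L}^{\rm 1st}}{\delta\Gamma^{\alpha\beta}} - a'^p\frac{\delta\mathcal{L}^{\rm 1st}}{\delta A^p} \\
&\quad + \varsigma\frac{\delta\mathcal{L}^{\rm 1st}}{\delta p}\wedge i_Z p - \frac{\delta\mathcal{L}^{\rm 1st}}{\delta\tau_\alpha}\wedge i_Z\tau_\alpha - \frac{\delta\mathcal{L}^{\rm 1st}}{\delta\rho_{\alpha\beta}}\wedge i_Z\rho_{\alpha\beta} - \frac{\delta\mathcal{L}^{\rm 1st}}{\delta P_p}\wedge i_Z P_p \\
&\quad + d\left(i_Z\varphi\wedge p + i_Z\vartheta^\alpha\tau_\alpha + l'^{\alpha\beta}\rho_{\alpha\beta} + a'^p P_p\right). \qquad (233)
\end{aligned}$$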


This generalized current expression is again an example of applying Noether’s 
analysis. In accord with the second theorem, for local symmetries, the current 
conservation expression becomes a differential identity. Again, we have a detailed 
expression that reflects Noether’s remarks regarding a more general version of 
Hilbert’s assertion concerning the absence of a proper energy-momentum conserva- 
tion law. Moreover there is again the Noether current ambiguity regarding the total 
differential term. However, as explained in Sec. 14, the first-order current is also the 
Hamiltonian, the canonical generator of local transformations including spacetime 
displacements. The total differential (boundary) term in the Hamiltonian can and 
should be adjusted, as we have discussed in general terms earlier in Sec. 14.4. Then,
when varied, the chosen boundary term in the variation of the Hamiltonian gives 
the associated boundary conditions. Thereby the usual Noether current ambiguity 
is fixed by the chosen boundary condition. 


19.3. General geometric Hamiltonian boundary terms 


Specializing our general Hamiltonian boundary term expression (120) to our present 
variables, with the preferred choice for the material and internal gauge fields leads 
to boundary term expressions which explicitly contain only the geometric variables: 


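$$\mathcal{B}(Z) = i_Z\begin{Bmatrix}\vartheta^\alpha\\ \bar\vartheta^\alpha\end{Bmatrix}\Delta\tau_\alpha + \Delta\vartheta^\alpha\wedge i_Z\begin{Bmatrix}\tau_\alpha\\ \bar\tau_\alpha\end{Bmatrix} + i_Z\begin{Bmatrix}\Gamma^\alpha{}_\beta\\ \bar\Gamma^\alpha{}_\beta\end{Bmatrix}\Delta\rho_\alpha{}^\beta + \Delta\Gamma^\alpha{}_\beta\wedge i_Z\begin{Bmatrix}\rho_\alpha{}^\beta\\ \bar\rho_\alpha{}^\beta\end{Bmatrix}, \qquad (234)$$

where the upper or lower line in each bracket is to be selected. A special case of
this expression, (upper, lower, upper, upper) with $\bar\tau_\alpha = 0$, was first proposed in
1991.8° With the above boundary term, the total differential term in $\delta\mathcal{H}(Z)$ has
the symplectic form

$$\begin{aligned}
\mathcal{C}(Z) &= \begin{Bmatrix} i_Z\delta\vartheta^\alpha\wedge\Delta\tau_\alpha\\ -i_Z\Delta\vartheta^\alpha\wedge\delta\tau_\alpha\end{Bmatrix} + \begin{Bmatrix}\Delta\vartheta^\alpha\wedge i_Z\delta\tau_\alpha\\ -\delta\vartheta^\alpha\wedge i_Z\Delta\tau_\alpha\end{Bmatrix} \\
&\quad + \begin{Bmatrix} i_Z\delta\Gamma^\alpha{}_\beta\wedge\Delta\rho_\alpha{}^\beta\\ -i_Z\Delta\Gamma^\alpha{}_\beta\wedge\delta\rho_\alpha{}^\beta\end{Bmatrix} + \begin{Bmatrix}\Delta\Gamma^\alpha{}_\beta\wedge i_Z\delta\rho_\alpha{}^\beta\\ -\delta\Gamma^\alpha{}_\beta\wedge i_Z\Delta\rho_\alpha{}^\beta\end{Bmatrix}. \qquad (235)
\end{aligned}$$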


The general geometric Hamiltonians evolve gauge dependent quantities, including
the frame and connection coefficients. Consequently they naturally include
terms that are gauge dependent. These terms are in particular those with factors
of $i_Z\Gamma^\alpha{}_\beta$. However, the value of the Hamiltonian boundary term will then
include a contribution to the energy, etc. due to the choice of frame gauge. From
such a boundary term, one could still get the correct physical value, but only if
one takes on the boundary the particular frame gauge $\pounds_Z\vartheta^\alpha = 0$, which means
one may need a different frame for the energy-momentum and angular momentum
components. For the purposes of obtaining directly a physical value for the observable
quantities, one must separate the frame gauge dependent term into a physical
energy plus a gauge dependent unphysical energy. This issue was first considered
by Hecht,**49 and he discovered a suitable remedy. The frame gauge dependent
part can be separated using the identity

$$\pounds_Z\vartheta^\alpha = d i_Z\vartheta^\alpha + i_Z d\vartheta^\alpha = D i_Z\vartheta^\alpha + i_Z D\vartheta^\alpha - i_Z\Gamma^\alpha{}_\beta\,\vartheta^\beta. \qquad (236)$$


With the aid of this relation one can get frame gauge independent boundary terms
for the quasi-local quantities. Thus, to drop the contribution from the frame gauge,
one should make the replacement
$$i_Z\Gamma^\alpha{}_\beta = D_\beta Z^\alpha + (i_Z T^\alpha)_\beta - (\pounds_Z\vartheta^\alpha)_\beta \;\to\; \tilde D_\beta Z^\alpha, \qquad (237)$$
where we have introduced the convenient transposed connection:
$$\tilde D Z^\alpha := DZ^\alpha + i_Z T^\alpha, \qquad (238)$$

which naturally shows up whenever one expresses Lie derivative expressions in 
terms of a covariant derivative. In addition to our argument above, this replace- 
ment has been justified using (i) theoretical analysis,4*-*° (ii) calculations for exact 
solutions,®° (iii) holonomic variables (for GR)?? and (iv) via a manifestly covari- 
ant Hamiltonian formulation.®” In our presentation here, we used the coframe both 
for convenience and for its local translational gauge role; the reference just cited 
provides a completely frame independent — manifestly covariant — alternative 
approach to the covariant Hamiltonian formalism. 
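The replacement (237) follows from (236) in two lines. The sketch below assumes only the notation already in use here, $Z^\alpha := i_Z\vartheta^\alpha$ and $T^\alpha := D\vartheta^\alpha$, and writes $(\cdot)_\beta$ for the component of a one-form along $\vartheta^\beta$.

```latex
% Start from (236) with Z^\alpha := i_Z\vartheta^\alpha and T^\alpha := D\vartheta^\alpha:
\begin{align*}
\pounds_Z\vartheta^\alpha
  &= DZ^\alpha + i_Z T^\alpha - (i_Z\Gamma^\alpha{}_\beta)\,\vartheta^\beta \\
\Longrightarrow\quad
(i_Z\Gamma^\alpha{}_\beta)\,\vartheta^\beta
  &= \underbrace{DZ^\alpha + i_Z T^\alpha}_{=\;\tilde D Z^\alpha \text{ by (238)}} - \pounds_Z\vartheta^\alpha .
\end{align*}
% Reading off the coefficient of \vartheta^\beta on both sides:
\begin{equation*}
i_Z\Gamma^\alpha{}_\beta = \tilde D_\beta Z^\alpha - (\pounds_Z\vartheta^\alpha)_\beta ,
\end{equation*}
% so in the frame gauge \pounds_Z\vartheta^\alpha = 0 the two sides of the
% replacement (237) coincide, and dropping the last term removes exactly the
% frame-gauge-dependent contribution.
```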


19.4. Quasi-local boundary terms 


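With the aforementioned replacement, we get our general set of symplectic
quasi-local quantity boundary terms for the PG:

$$\mathcal{B}(Z) = i_Z\begin{Bmatrix}\vartheta^\alpha\\ \bar\vartheta^\alpha\end{Bmatrix}\Delta\tau_\alpha + \Delta\vartheta^\alpha\wedge i_Z\begin{Bmatrix}\tau_\alpha\\ \bar\tau_\alpha\end{Bmatrix} + \begin{Bmatrix}\tilde D_\beta Z^\alpha\\ \bar{\tilde D}_\beta Z^\alpha\end{Bmatrix}\Delta\rho_\alpha{}^\beta + \Delta\Gamma^\alpha{}_\beta\wedge i_Z\begin{Bmatrix}\rho_\alpha{}^\beta\\ \bar\rho_\alpha{}^\beta\end{Bmatrix}, \qquad (239)$$

where the upper or lower line in each bracket is to be selected. As in thermodynamics,
there are several kinds of “energy,” each corresponding to the work done in
a different (ideal) physical process.®?-®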


19.5. A preferred choice 


The cases of the PG that have been studied are mostly those where the first-order
potential is quadratic in the momentum fields; this leads to a linear relation
between the momenta and field strengths, and corresponds to second-order
quasi-linear equations for the geometric potentials. The natural reference choice is
Minkowski spacetime, for which $\bar\tau_\alpha$ vanishes and $\bar{\tilde D} = \bar D$. For this class of theories,
other things being equal, we would favor from among the set (239) the Hamiltonian
boundary term quasi-local choice (upper, lower, lower, upper), i.e.

$$\mathcal{B}(Z) = i_Z\vartheta^\alpha\,\tau_\alpha + \Delta\Gamma^\alpha{}_\beta\wedge i_Z\rho_\alpha{}^\beta + \bar D_\beta Z^\alpha\,\Delta\rho_\alpha{}^\beta, \qquad (240)$$
which leads to the following boundary term in the variation of the Hamiltonian:
$$\mathcal{C}(Z) = i_Z\left(\delta\vartheta^\alpha\wedge\tau_\alpha - \Delta\Gamma^\alpha{}_\beta\wedge\delta\rho_\alpha{}^\beta\right). \qquad (241)$$

This corresponds to imposing boundary conditions on the coframe and the momentum
conjugate to the connection. The associated energy flux expression is

$$\pounds_Z\mathcal{H}(Z) = d i_Z\left(\pounds_Z\vartheta^\alpha\wedge\tau_\alpha - \Delta\Gamma^\alpha{}_\beta\wedge\pounds_Z\rho_\alpha{}^\beta\right). \qquad (242)$$


Regarding the total energy-momentum and angular momentum/CoMM, the 
expression (240) matches the expression (with Minkowski reference) for the PG 
Hamiltonian boundary term at spatial infinity that was first proposed by Hecht in 
1993" 


19.6. Einstein’s GR 


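Within the PG context, the special case of Riemannian geometry can be reached
by imposing vanishing torsion with a Lagrange multiplier. In the general first-order
formulation it is sufficient to take the potential to be independent of the coframe
conjugate momentum $\tau_\mu$.

A first-order Lagrangian for vacuum* GR is

$$\mathcal{L}_{\rm GR} = R^{\alpha\beta}\wedge\rho_{\alpha\beta} + D\vartheta^\mu\wedge\tau_\mu - V^{\alpha\beta}\wedge\left(\rho_{\alpha\beta} - \frac{1}{2\kappa}\eta_{\alpha\beta}\right), \qquad (243)$$

which uses a Lagrange multiplier field $V^{\alpha\beta} = V^{[\alpha\beta]}$ to give the connection’s
conjugate momentum field a value which depends on the orthonormal frame:

$$\rho_{\alpha\beta} - \frac{1}{2\kappa}\eta_{\alpha\beta} = 0. \qquad (244)$$

Variation of (243) w.r.t. the coframe, connection and their respective momenta
fields gives the (vacuum) equations:

$$\delta\vartheta^\mu:\quad 0 = D\tau_\mu + V^{\alpha\beta}\wedge\frac{1}{2\kappa}\eta_{\alpha\beta\mu}, \qquad (245)$$
$$\delta\Gamma^{\alpha\beta}:\quad 0 = D\rho_{\alpha\beta} + \vartheta_{[\beta}\wedge\tau_{\alpha]}, \qquad (246)$$
$$\delta\tau_\mu:\quad 0 = D\vartheta^\mu, \qquad (247)$$
$$\delta\rho_{\alpha\beta}:\quad 0 = R^{\alpha\beta} - V^{\alpha\beta}. \qquad (248)$$

As expected (247) gives vanishing torsion. From the differential of (244) one gets

$$D\rho_{\alpha\beta} = \frac{1}{2\kappa}D\vartheta^\mu\wedge\eta_{\alpha\beta\mu}, \qquad (249)$$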

*For our purposes concerning the Hamiltonian boundary term, we consider here for simplicity 
just the vacuum case. The results will apply to all situations where the boundary of the region of 
interest is in the vacuum region, outside of the domain of the matter fields. That should include 
most of the cases of physical interest. 
which vanishes, hence (246) reduces to $\vartheta_{[\beta}\wedge\tau_{\alpha]} = 0$, from which it follows that $\tau_\mu$
vanishes. Then (245) with (248) reduces to the vanishing of the Einstein 3-form:

$$0 = \frac{1}{2\kappa}R^{\alpha\beta}\wedge\eta_{\alpha\beta\mu} = -\frac{1}{\kappa}G^\nu{}_\mu\,\eta_\nu. \qquad (250)$$
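By the way, had we included a suitable source, we would have obtained
$$0 = \frac{1}{2\kappa}R^\alpha{}_\beta\wedge\eta_\alpha{}^\beta{}_\mu + \mathfrak{T}_\mu, \qquad (251)$$

where $\mathfrak{T}_\mu$ is the Hilbert energy-momentum 3-form. Using $D\eta_\alpha{}^\beta{}_\mu = 0$, this relation
can be rearranged as follows!??:

$$\begin{aligned}
0 &= \frac{1}{2\kappa}\left[d(\Gamma^\alpha{}_\beta\wedge\eta_\alpha{}^\beta{}_\mu) + \Gamma^\alpha{}_\beta\wedge d\eta_\alpha{}^\beta{}_\mu + \Gamma^\alpha{}_\gamma\wedge\Gamma^\gamma{}_\beta\wedge\eta_\alpha{}^\beta{}_\mu\right] + \mathfrak{T}_\mu \\
&= \frac{1}{2\kappa}\left[d(\Gamma^\alpha{}_\beta\wedge\eta_\alpha{}^\beta{}_\mu) - \Gamma^\alpha{}_\gamma\wedge\Gamma^\gamma{}_\beta\wedge\eta_\alpha{}^\beta{}_\mu + \Gamma^\alpha{}_\beta\wedge\Gamma^\gamma{}_\mu\wedge\eta_\alpha{}^\beta{}_\gamma\right] + \mathfrak{T}_\mu. \qquad (252)
\end{aligned}$$
The rearrangement identifies a certain superpotential 2-form,

$$\mathfrak{U}_\mu = \Gamma^\alpha{}_\beta\wedge\eta_\alpha{}^\beta{}_\mu, \qquad (253)$$
and gravitational energy-momentum pseudotensor 3-form,
$$2\kappa\,\mathfrak{t}_\mu = -\Gamma^\alpha{}_\gamma\wedge\Gamma^\gamma{}_\beta\wedge\eta_\alpha{}^\beta{}_\mu + \Gamma^\alpha{}_\beta\wedge\Gamma^\gamma{}_\mu\wedge\eta_\alpha{}^\beta{}_\gamma. \qquad (254)$$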

These manipulations and the resultant expressions are meaningful in both orthonor- 
mal and holonomic frames. In orthonormal frames, they give the expressions for 
the so-called tetrad-teleparallel gauge current,2® while in holonomic frames we have 
obtained neat form versions of the Freud superpotential (22) and the Einstein pseu- 
dotensor (10). There is a remarkable contrast between the simple short calculation 
given here for these quantities and the long complicated ones discussed in Sec. 4 
that were done in the past using tensor calculus. Via rearrangements of the field 
equations analogous to (252), generalized pseudotensors can be found for the PG.%° 
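The “simple short calculation” behind (252) can be written out explicitly. The following is a worked sketch under stated assumptions: it uses $R^\alpha{}_\beta = d\Gamma^\alpha{}_\beta + \Gamma^\alpha{}_\gamma\wedge\Gamma^\gamma{}_\beta$, reads $D\eta_\alpha{}^\beta{}_\mu = 0$ as a statement about the ordinary covariant exterior derivative on a one-form-valued object with the index structure shown, and relies only on the anticommutation of one-forms and relabeling of dummy indices.

```latex
% Assumption: D\eta_\alpha{}^\beta{}_\mu = 0 means
%   d\eta_\alpha{}^\beta{}_\mu = \Gamma^\gamma{}_\alpha\wedge\eta_\gamma{}^\beta{}_\mu
%                              - \Gamma^\beta{}_\gamma\wedge\eta_\alpha{}^\gamma{}_\mu
%                              + \Gamma^\gamma{}_\mu\wedge\eta_\alpha{}^\beta{}_\gamma .
\begin{align*}
R^\alpha{}_\beta\wedge\eta_\alpha{}^\beta{}_\mu
 &= d\Gamma^\alpha{}_\beta\wedge\eta_\alpha{}^\beta{}_\mu
    + \Gamma^\alpha{}_\gamma\wedge\Gamma^\gamma{}_\beta\wedge\eta_\alpha{}^\beta{}_\mu \\
 &= d(\Gamma^\alpha{}_\beta\wedge\eta_\alpha{}^\beta{}_\mu)
    + \Gamma^\alpha{}_\beta\wedge d\eta_\alpha{}^\beta{}_\mu
    + \Gamma^\alpha{}_\gamma\wedge\Gamma^\gamma{}_\beta\wedge\eta_\alpha{}^\beta{}_\mu , \\[4pt]
\Gamma^\alpha{}_\beta\wedge d\eta_\alpha{}^\beta{}_\mu
 &= \underbrace{\Gamma^\alpha{}_\beta\wedge\Gamma^\gamma{}_\alpha\wedge\eta_\gamma{}^\beta{}_\mu}_{
      =\,-\Gamma^\alpha{}_\gamma\wedge\Gamma^\gamma{}_\beta\wedge\eta_\alpha{}^\beta{}_\mu}
    \;\underbrace{-\,\Gamma^\alpha{}_\beta\wedge\Gamma^\beta{}_\gamma\wedge\eta_\alpha{}^\gamma{}_\mu}_{
      =\,-\Gamma^\alpha{}_\gamma\wedge\Gamma^\gamma{}_\beta\wedge\eta_\alpha{}^\beta{}_\mu}
    + \Gamma^\alpha{}_\beta\wedge\Gamma^\gamma{}_\mu\wedge\eta_\alpha{}^\beta{}_\gamma , \\[4pt]
\Longrightarrow\quad
R^\alpha{}_\beta\wedge\eta_\alpha{}^\beta{}_\mu
 &= d(\Gamma^\alpha{}_\beta\wedge\eta_\alpha{}^\beta{}_\mu)
    - \Gamma^\alpha{}_\gamma\wedge\Gamma^\gamma{}_\beta\wedge\eta_\alpha{}^\beta{}_\mu
    + \Gamma^\alpha{}_\beta\wedge\Gamma^\gamma{}_\mu\wedge\eta_\alpha{}^\beta{}_\gamma .
\end{align*}
```

The exact term is the differential of the superpotential 2-form and the two quadratic terms are the pseudotensor 3-form, which is the split described above.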


19.7. Preferred boundary term for GR 


Over 20 years ago, using the covariant Hamiltonian symplectic boundary term
approach, we proposed a quasi-local boundary term for GR?? (an equivalent
quasi-local expression obtained from a Noether argument using a global background
reference was proposed at about the same time by Katz et al.®7°):

$$\mathcal{B}(Z) = \frac{1}{2\kappa}\left(\Delta\Gamma^\alpha{}_\beta\wedge i_Z\eta_\alpha{}^\beta + \bar D_\beta Z^\alpha\,\Delta\eta_\alpha{}^\beta\right); \qquad \eta^{\alpha\beta\cdots} = *(\vartheta^\alpha\wedge\vartheta^\beta\wedge\cdots). \qquad (255)$$

(This has the form of Hecht’s PG expression restricted to GR and naturally extended
to a boundary that need not be at infinity.) The boundary term in the variation of
the Hamiltonian has the form
$$\delta\mathcal{H}(Z) \sim d i_Z\left(\Delta\Gamma^\alpha{}_\beta\wedge\delta\eta_\alpha{}^\beta\right). \qquad (256)$$

Since $\eta^{\alpha\beta} = *(\vartheta^\alpha\wedge\vartheta^\beta)$, this corresponds to fixing the orthonormal coframe $\vartheta^\mu$ (and
thus the metric) on the boundary. The energy flux expression is

$$\pounds_Z\mathcal{H}(Z) = d i_Z\left(\Delta\Gamma^\alpha{}_\beta\wedge\pounds_Z\eta_\alpha{}^\beta\right). \qquad (257)$$


Like many other boundary term choices, at spatial infinity it gives the ADM, 
MTW,*? Regge—Teitelboim,1°° Beig—O Murchadha,? Szabados!*? energy, momen- 
tum, angular momentum, CoMM. 

It has some special virtues: 


(i) At null infinity: The Bondi-Trautman energy and the Bondi energy flux,?4
(ii) It is “covariant,”
(iii) It is positive: at least for spherical solutions®® and large spheres,
(iv) For small spheres it is a positive multiple of the Bel-Robinson tensor,°”
(v) First law of thermodynamics for black holes,?4
(vi) For spherical solutions it has the hoop property,°®
(vii) For a suitable choice of reference it vanishes for Minkowski space.

20. A “Best Matched” Reference 


In this section, we turn to the second ambiguity that is inherent in quasi-local 
energy-momentum expressions: The choice of reference. Minkowski space is the 
natural choice, but one still needs to choose a specific Minkowski space. Recently we 
proposed (i) 4D isometric matching on the boundary and (ii) energy optimization as 
criteria for selecting the “best matched” reference on the boundary of the quasi-local 
region. 

Note: For all other fields, it is appropriate to choose vanishing reference values as 
the reference ground state — the vacuum. But for geometric gravity, the standard 
ground state is the nonvanishing Minkowski metric, so a nontrivial reference is 
essential. One still needs to specify exactly which Minkowski space. 

Reference values can be determined by choosing, in a neighborhood of the
desired spacelike boundary 2-surface $S$, 4 smooth functions $y^i = y^i(x^\mu)$, $i = 0, 1, 2, 3$
with $dy^0\wedge dy^1\wedge dy^2\wedge dy^3 \neq 0$ and then defining a Minkowski reference by

$$\bar g = -(dy^0)^2 + (dy^1)^2 + (dy^2)^2 + (dy^3)^2. \qquad (258)$$

Geometrically this is equivalent to finding a diffeomorphism for a neighborhood
of the 2-surface into Minkowski space. The associated reference connection is the
pullback of the flat Minkowski connection:

$$\bar\Gamma^\alpha{}_\beta = x^\alpha{}_i\left(\bar\Gamma^i{}_j\,y^j{}_\beta + dy^i{}_\beta\right) = x^\alpha{}_i\,dy^i{}_\beta. \qquad (259)$$

Here $x^\alpha{}_i$ is the inverse of $y^i{}_\beta$, where $dy^i = y^i{}_\beta\vartheta^\beta$.
With these standard Minkowski coordinates $y^i$, a Killing field of the reference
has the form $Z^k = Z_0^k + \lambda_0{}^k{}_l\,y^l$, where $\lambda_0^{kl} = \lambda_0^{[kl]}$ with $Z_0^k$ and $\lambda_0^{kl}$ being constants.
The 2-surface integral of any one of our Hamiltonian boundary terms then has a
value linear in these constant values:

$$\oint_S \mathcal{B}(Z) = -Z_0^k\,p_k(S) + \frac12\lambda_0^{kl}\,J_{kl}(S). \qquad (260)$$


This implicitly determines not only a quasi-local energy-momentum but also a
quasi-local angular momentum/CoMM. When specialized to GR the integrals
$p_k(S)$, $J_{kl}(S)$ in the spatial asymptotic limit agree with accepted expressions for
these quantities: MTW*? §20.2 and Regge-Teitelboim!® with the refinements of 
Beig-O Murchadha® and Szabados.!#? For the PG at spatial infinity, Hecht*? com- 
pared in detail his expression with the other proposed expressions, e.g. Ref. 10. 
At spatial infinity, with the asymptotics (122), all of our PG symplectic boundary 
terms (239) will give the same values as Hecht’s expression (240). 

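For energy-momentum, one takes $Z$ to be a translational Killing field of the
Minkowski reference. Then the second term in our preferred quasi-local boundary
expressions (240) and (255) vanishes.Y With $Z^k = Z_0^k$ = constant our preferred
quasi-local expressions now take the form

$$\mathcal{B}_{\rm PG}(Z) = Z_0^k\,x^\mu{}_k\left[\tau_\mu + \left(\Gamma^\alpha{}_\beta - x^\alpha{}_j\,dy^j{}_\beta\right)\wedge i_{e_\mu}\rho_\alpha{}^\beta\right], \qquad (261)$$
$$\mathcal{B}_{\rm GR}(Z) = \frac{1}{2\kappa}Z_0^k\,x^\mu{}_k\left(\Gamma^\alpha{}_\beta - x^\alpha{}_j\,dy^j{}_\beta\right)\wedge\eta_\alpha{}^\beta{}_\mu. \qquad (262)$$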
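That the second term of (240) and (255) drops out for a translational Killing field can be checked directly from the reference connection (259). The sketch below rests on labeled assumptions: the constant components are written $Z_0^k$ (a label chosen here), the frame components are $Z^\alpha = x^\alpha{}_k Z_0^k$, and $\bar D$ acts on them as $\bar D Z^\alpha = dZ^\alpha + \bar\Gamma^\alpha{}_\beta Z^\beta$.

```latex
% Assumptions: Z^\alpha = x^\alpha{}_k Z_0^k with Z_0^k constant;
%              \bar\Gamma^\alpha{}_\beta = x^\alpha{}_i\, dy^i{}_\beta   (259);
%              x^\alpha{}_i\, y^i{}_\beta = \delta^\alpha_\beta .
\begin{align*}
0 = d(x^\alpha{}_i\, y^i{}_\beta)
  \;&\Longrightarrow\;
  dx^\alpha{}_k = -\,x^\alpha{}_i\,(dy^i{}_\beta)\,x^\beta{}_k , \\[4pt]
\bar D Z^\alpha
  &= d(x^\alpha{}_k Z_0^k) + \bar\Gamma^\alpha{}_\beta\, x^\beta{}_k Z_0^k \\
  &= \bigl(-\,x^\alpha{}_i\, dy^i{}_\beta\, x^\beta{}_k
           + x^\alpha{}_i\, dy^i{}_\beta\, x^\beta{}_k\bigr) Z_0^k
   = 0 .
\end{align*}
```

Hence $\bar D_\beta Z^\alpha = 0$ and only the terms displayed in (261) and (262) survive.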


20.1. The choice of reference 


To be complete, our Hamiltonian boundary term and the quasi-local
energy-momentum/angular momentum proposal needs a prescription for choosing a
reference on the boundary. There are several options; one needs a choice suited to the
application.

For an extended region, one may want a global background (see Refs. 101 and 
102 for this approach in GR). Consider for example solar system applications, more 
specifically say we want to calculate using our quasi-local energy flux formula the 
tidal energy flux between Jupiter and its moon Io, that is believed to be responsible 
for Io’s volcanos. (This has already been done by several methods.1!419>) On the 
other hand, if one wishes to study a given metric expressed in a specific coordinate 
system, the analytic approach”® may be a good choice. 

To explicitly determine the specific values of the quasi-local quantities, one needs
some good way to choose the reference. Minkowski spacetime is the natural choice,
especially for asymptotically flat spacetimes. However, as noted above, almost any
four functions will determine some Minkowski reference. With such freedom, one
can still get almost any value for the quasi-local quantities. This freedom is the
quasi-local version of the second type of ambiguity mentioned in the introduction.

^y For GR the second term in (255) also vanishes for 4D isometric matching on S, a condition we
shall use below.

Recently we proposed a program®® to fix the “best” choice for a quasi-local
reference, i.e. one that is determined by the dynamical fields on the boundary. It
has two parts: 4D isometric matching and optimization of a certain quantity. Here
we present it along with a promising alternative optimization.!?” For GR we have
already found that our new procedure works well for an important special case: a
certain class of axisymmetric spacetimes!*® which includes the Kerr metric.

For the PG, not so much has been done yet. While PG energy-momentum and
angular momentum calculations at both spatial and future null infinity were done
long ago,°?>! and gave, in particular, the expected results for the Kerr metric, as far
as we know there have not yet been any genuine quasi-local (i.e. for a finite region)
PG calculations. This is not so surprising. In general the big obstacle is the 2D
isometric embedding, which we are about to discuss. For spherical symmetry, all the
calculations at least for GR can easily be done analytically. For the aforementioned
class of axisymmetric metrics, the 2D embedding problem has an algebraic solution.
But the boundary expressions are already sufficiently complicated that the quasi-local
energy integral for GR could only be evaluated numerically. Now that it is
known how to do the case of axisymmetric GR, the way is open for truly quasi-local
PG energy and angular momentum calculations. For the PG, it seems that
numerical calculations will be unavoidable.


20.2. Isometric matching of the 2-surface 


We first recall an important procedure that has been used: isometric matching of
the 2-surface S. This can be expressed in terms of quasi-spherical foliation adapted
coordinates $t, r, \theta, \varphi$ as

$$g_{AB} \doteq \bar{g}_{AB} = \bar{g}_{ij}\, y^i_A y^j_B = -y^0_A y^0_B + \delta_{ab}\, y^a_A y^b_B, \tag{263}$$

where S is given by constant values of $t, r$, and $a, b = 1, 2, 3$ while $A, B$ range
over $\theta, \varphi$. We use $\doteq$ to indicate a relation which holds only on the 2-surface S.
Equation (263) is three conditions on the four functions $y^i$. One can regard $y^0$ as
the free choice. From a classic theorem on embedding a closed 2-surface into $\mathbb{R}^3$, as
long as S and $y^0(x^\mu)$ are such that on S

$$g'_{AB} = g_{AB} + y^0_A y^0_B \tag{264}$$

is convex, one has a unique embedding. Wang and Yau have discussed in detail
this type of embedding of a 2-surface into Minkowski space controlled by one function in
their recent quasi-local work.25:!43:144

Unfortunately, although there is a unique embedding, there is generally no 
explicit analytic formula except in special simple cases, such as spherical symmetry 
or axisymmetry. The lack of an explicit formula for the solution of this 2D isometric 
embedding prevents exact quasi-local calculations in general. 
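
As a small illustration of the matching condition (263) in the simplest case, the following Python sketch checks numerically that a round sphere is isometrically embedded in flat $\mathbb{R}^3$ by the standard map. It assumes the trivial choice $y^0 = 0$, so that (263) reduces to $g_{AB} = \delta_{ab}\, y^a_A y^b_B$; the function names, the sample point and the finite-difference step are illustrative choices and do not come from the text.

```python
import math

def embedding(theta, phi, radius):
    """Standard map y^a(theta, phi) of a round sphere into flat R^3."""
    return (radius * math.sin(theta) * math.cos(phi),
            radius * math.sin(theta) * math.sin(phi),
            radius * math.cos(theta))

def pulled_back_metric(theta, phi, radius, h=1e-5):
    """delta_ab y^a_A y^b_B, with y^a_A = dy^a/dx^A taken by central differences."""
    d_theta = [(p - m) / (2 * h) for p, m in
               zip(embedding(theta + h, phi, radius), embedding(theta - h, phi, radius))]
    d_phi = [(p - m) / (2 * h) for p, m in
             zip(embedding(theta, phi + h, radius), embedding(theta, phi - h, radius))]
    tangents = (d_theta, d_phi)
    return [[sum(u * v for u, v in zip(tangents[A], tangents[B]))
             for B in range(2)] for A in range(2)]

def intrinsic_metric(theta, radius):
    """Round-sphere 2-metric g_AB = radius^2 * diag(1, sin^2(theta))."""
    return [[radius**2, 0.0], [0.0, (radius * math.sin(theta))**2]]

def matching_error(theta, phi, radius):
    """Largest componentwise mismatch between the two sides of the matching condition."""
    g_embedded = pulled_back_metric(theta, phi, radius)
    g_surface = intrinsic_metric(theta, radius)
    return max(abs(g_embedded[A][B] - g_surface[A][B])
               for A in range(2) for B in range(2))

# The mismatch is limited only by the finite-difference step.
print(matching_error(0.7, 1.3, 2.0))
```

For a general surface no such closed-form map is available, which is exactly the obstacle described above; the check itself, however, carries over unchanged once candidate embedding functions are supplied.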
20.3. Complete 4D isometric matching


Our “new” proposal^z is: complete 4D isometric matching on S, $g \doteq \bar{g}$, a part of
which is still the just-discussed 2D isometric embedding.

In view of isometric matching, there should be a Lorentz transformation which
on the boundary brings the dynamical coframe into line with the reference frame:

$$\vartheta'^i = L^i{}_j \vartheta^j \doteq dy^i. \tag{265}$$

The integrability condition for this equation is

$$d\vartheta'^i|_S = 0. \tag{266}$$

This 2-form equation restricted to the 2-surface gives 4 conditions on the 6-parameter
set of Lorentz transformations $L^i{}_j$. Thus 4D isometric matching has 6 − 4 = 2
degrees of freedom. They can be identified as the choice of time embedding function
$y^0$ in (263) plus a boost parameter $\alpha$ in the plane normal to S.


20.4. Optimal energy 


To fix the remaining two isometric matching parameters, one can regard the
quasi-local value as a measure of the difference between the dynamical and the reference
boundary values. This value will be a functional of $y^0$, $\alpha$. The critical points of this
functional determine the distinguished choices for these two functions.

Previously we proposed®® taking the optimal “best matched” embedding as the
one which gives an extreme value to the associated invariant mass $m^2 = -p_i p_j g^{ij}$.
This should determine the reference up to a Poincaré transformation.

This is a reasonable condition, but, unfortunately, not so practical. The invariant
mass is a sum of four terms, each quadratic in an integral over S. Note, however,
that using the Poincaré freedom, one can get the same m value in the
center-of-momentum frame from $p_0$. This leads us to our new proposal: take the preferred
reference as one that gives a critical value to the quasi-local energy given by (260)
and (262) or (261) with $Z^k = Z_0^k = \delta^k_0$. We expect this much simpler optimization
to give the same reference geometry as that obtained from using $m^2$.

Based on some physical and practical computational arguments, it seems reasonable
to expect a unique solution in general. In a numerical calculation one could, in principle,
just calculate the energy values given by (260) and (261) or (262) with
$Z^k = \delta^k_0$ for a great many choices of $y^0, \alpha$ subject to the 4D isometric matching
constraint (265) and the integrability condition (266), and then note the energy
critical points.

^z For GR, this was proposed by Szabados at a workshop in Hsinchu, Taiwan in 2000. He has since
investigated it in detail.133

For GR, this “best matching” procedure already gave reasonable quasi-local
energy results for spherically symmetric systems.207*149:159 For the Schwarzschild
metric, the “best matched” quasi-local energy has the value first found by Brown
and York,!”

$$E = r\left(1 - \sqrt{1 - \frac{2m}{r}}\right) = \frac{2m}{1 + \sqrt{1 - \dfrac{2m}{r}}}, \tag{267}$$

using a closely related boundary term and a 2-surface embedding into $\mathbb{R}^3$. Now
we also have sensible results for certain axisymmetric systems including the Kerr
metric.!?° For the surface of constant Boyer–Lindquist r, the angular momentum
is simply a constant independent of r, equal to its usual asymptotic value, J. The
quasi-local energy integral, however, is not so simple and can only be evaluated
numerically.
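
The Brown–York value for the Schwarzschild metric is known in closed form, $E(r) = r\,(1 - \sqrt{1 - 2m/r})$ in geometric units, so its limiting behaviour is easy to tabulate. The Python sketch below (function names are illustrative and not from the text) evaluates it in two algebraically equivalent forms and shows that it equals $2m$ on the horizon $r = 2m$ and decreases toward the asymptotic mass $m$ as $r \to \infty$.

```python
import math

def brown_york_energy(r, m):
    """E = r * (1 - sqrt(1 - 2m/r)) in geometric units (G = c = 1), valid for r >= 2m."""
    return r * (1.0 - math.sqrt(1.0 - 2.0 * m / r))

def brown_york_energy_stable(r, m):
    """Equivalent form 2m / (1 + sqrt(1 - 2m/r)); avoids cancellation at large r."""
    return 2.0 * m / (1.0 + math.sqrt(1.0 - 2.0 * m / r))

mass = 1.0
for radius in (2.0, 3.0, 10.0, 1.0e3, 1.0e6):
    # The energy falls monotonically from 2m at the horizon toward m at infinity.
    print(radius, brown_york_energy_stable(radius, mass))
```

The second form is preferable numerically at large $r$, where the first subtracts two nearly equal numbers.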


21. Concluding Discussion 


In addition to her two key theorems regarding symmetry, Emmy Noether in her
1918 paper also proved that for diffeomorphically invariant gravity, there is no
proper total energy-momentum density. In other words, there is no covariant total
energy-momentum density tensor for gravitating systems. In Ch. 20 of Ref. 82, Misner,
Thorne and Wheeler discuss this feature as a consequence of the equivalence
principle and argue that only the total energy-momentum of gravitating systems
is meaningful. But clearly the gravitational interaction is local and does allow for
the local exchange of energy-momentum. To account for this, various noncovariant
expressions called pseudotensors (each generated by a certain superpotential) have
been proposed. There thus arose two ambiguities: which expression, and in which
reference frame? The modern idea is quasi-local: energy-momentum is associated with
a closed 2-surface.

One approach, which is the one we have used, is via the Hamiltonian. With 
the aid of differential forms and a first-order variational formulation, we have 
developed a covariant Hamiltonian formalism. The Hamiltonian boundary term 
plays key roles: It determines the boundary conditions and the quasi-local val- 
ues. We have shown that this approach includes all the traditional pseudotensors 
while clarifying the ambiguities: Each superpotential is a possible Hamiltonian 
boundary term which is associated with a specific boundary condition, and the 
reference frame becomes a choice of the reference values (ground state) on the 
boundary. 

One can regard gravity as a local gauge theory for the global symmetry group for
Minkowski spacetime: the Poincaré group. The appropriate geometry is Riemann–Cartan:
the curvature is the field strength for Lorentz transformations and the
torsion is the field strength for local translations (infinitesimal diffeomorphisms).
For comparison, we included in our discussion a general internal gauge field. In this
way, we can identify the analog of the internal gauge vector potential. For the local
Lorentz symmetry, it is the (metric compatible) connection one-form; for the local
translation symmetry, it is the orthonormal coframe. We developed in considerable
detail the associated Lagrangian and Hamiltonian Noether currents and differential
identities, so one can compare the similarities and differences of the spacetime
gauge symmetry expressions with those of a generic internal symmetry. Briefly,
the Lorentz rotation sector is quite similar to that of an internal gauge symmetry,
but the translation symmetry has both striking similarities and differences. This
would be much less clear if we had opted for a formulation which does not include
the coframe as a dynamical variable. In the approach of Blagojević and Hehl,® the
use of the orthonormal frame is motivated by the need to describe spin; here our
basic motivation is in terms of the coframe’s fundamental gauge role regarding local
spacetime translations.

The geometric/gauge symmetry approach is helpful in identifying a good
Hamiltonian boundary term expression for quasi-local quantities. Our
preferred expression for GR turns out to correspond to fixing the metric on the
boundary, which is obviously a reasonable boundary condition choice.

A notable feature of the Hamiltonian boundary term for dynamic geometry 
is that it necessarily depends on the choice of some nondynamical reference val- 
ues (this is a manifestation of Noether’s result regarding nonexistence of a proper 
energy-momentum density). One can get almost any quasi-local value if one allows 
free rein in the choice of reference. One needs to fix a nondynamical reference frame, 
but only on the boundary of the region. This can be compared to choosing some 
flat plane to map a part of the curved surface of the Earth. One could slice the 
plane through the surface of the Earth; for a spherical Earth the planar geometry 
would exactly match on a circle. Similarly for spacetime. On the 2D boundary of a 
spatial region, one can exactly match the curved 4D spacetime metric with a flat 
Minkowski spacetime metric. Detailed analysis shows that there are still two degrees
of freedom. We proposed that a good way to fix these was by considering the critical 
points of our quasi-local expression. There might be other sensible options, but the 
main point is that a reasonable choice is available. 

Our principal results, the Hamiltonian boundary terms that are our preferred 
quasi-local energy-momentum expressions (240) and (255) for the PG and GR, were 
obtained by considering the Hamiltonian, geometry, Noether symmetry, and space- 
time gauge theory. The harmonious combination of all of these perspectives makes 
a strong case for the results. Nevertheless, it should be noted that one can also 
be led to this result from other perspectives. Regarding GR, essentially the same 
expression has been obtained (i) via a Noether approach with a global reference®*»’® 
and (ii) via a symplectic covariant Hamiltonian approach using the metric in a holo- 
nomic frame.?? For the PG (including GR as a special case), the same preferred 
expression was found via an entirely frame independent manifestly covariant Hamiltonian
formalism.®” Although in principle there is an unlimited number of possible
Hamiltonian boundary term quasi-local energy-momentum expressions, which
are in a formal sense all of equal status, in practice one can, with very good
reasons, discover that one expression stands out.
After reviewing the meaning of various equivalence principles and the structure of
electrodynamics, we give a fairly detailed account of the construction of the light cone and
a core metric from the equivalence principle for photons (no birefringence, no polarization
rotation and no amplification/attenuation in propagation) in the framework of linear
electrodynamics using cosmic connections/observations as empirical support. The cosmic
nonbirefringent propagation of photons independent of energy and polarization verifies
the Galileo Equivalence Principle (Universality of Propagation) for photons/electromagnetic
wave packets in spacetime. This nonbirefringence constrains the spacetime constitutive
tensor to high precision to a core metric form with an Abelian axion degree and a
dilaton degree of freedom. Thus comes the metric with axion and dilaton. Constraints
on axion and dilaton from astrophysical/cosmic propagation are reviewed. Eötvös-type
experiments, Hughes–Drever-type experiments and redshift experiments then constrain and
tie this core metric to agree with the matter metric, and hence a unique physical metric
and universality of metrology. We summarize these experiments and review how the
Galileo equivalence principle constrains the Einstein Equivalence Principle (EEP)
theoretically. In local physics this physical metric gives the Lorentz/Poincaré covariance.
Understanding that the metric and EEP come from the vacuum as a medium of
electrodynamics in the linear regime, efforts to actively look for potential effects beyond
this linear scheme are warranted. We emphasize the importance of doing Eötvös-type
experiments or other types of experiments using polarized bodies/polarized particles. We
review the theoretical progress on the issue of gyrogravitational ratio for fundamental
particles and update the experimental progress on the measurements of possible long
range/intermediate range spin–spin, spin–monopole and spin–cosmos interactions.


Keywords: Equivalence principles; spacetime structure; general relativity; classical 
electrodynamics; polarization; spin. 


1. Introduction 


In the genesis of general relativity, there are two important cornerstones: the Einstein
Equivalence Principle (EEP) and the metric as the dynamic quantity of
gravitation (see, e.g. Ref. 1). With research activities on cosmology thriving, people
have been looking actively for alternative theories of gravity again for more than
30 years. Recent theoretical studies include scalars, pseudoscalars, vectors, metrics,
bimetrics, strings, loops, etc. as dynamic quantities of gravity. It is the aim of this
review to look for the foundations of gravity and general relativity, especially from
an empirical point of view.

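Relativity sprang from the Maxwell–Lorentz theory of electromagnetism.
Maxwell equations in Gaussian units are

$$\nabla \cdot \mathbf{D} = 4\pi\rho, \tag{1a}$$
$$\nabla \times \mathbf{H} - \frac{\partial \mathbf{D}}{\partial t} = 4\pi \mathbf{J}, \tag{1b}$$
$$\nabla \cdot \mathbf{B} = 0, \tag{1c}$$
$$\nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} = 0, \tag{1d}$$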

where D is the displacement, H the magnetic field, B the magnetic induction, E
the electric field, ρ the electric charge density and J the electric current density.
We use units with the nominal light velocity c equal to 1 (see, e.g. Ref. 2, p.
218 (6.28)). With the sources known, from these equations with eight components
we are supposed to be able to solve for the unknown fields D, H, B and E with
12 degrees of freedom. These equations form an underdetermined system unless
we supplement them with relations. These relations are the constitutive relation
between (D, H) and (E, B) (or (D, B) and (E, H)):

$$(\mathbf{D}, \mathbf{H}) = \chi(\mathbf{E}, \mathbf{B}), \tag{2}$$

where χ(E, B) is a 6-component functional of E and B. With the constitutive
relation, the unknown degrees of freedom become 6, and the Maxwell equations seem
to be overdetermined. Note that if we take the divergence of (1d), by (1c) it is
automatically satisfied. Hence (1c) and (1d) (the Faraday tetrad) have only three
independent equations. If we take the divergence of (1b), by (1a) it becomes the
continuity equation

$$\nabla \cdot \mathbf{J} + \frac{\partial \rho}{\partial t} = 0, \tag{3}$$

a constraint equation on sources. Hence, (1a) and (1b) (the Ampère–Maxwell
tetrad) have only three independent equations also. To form a complete system
of equations, we need equations governing the action of the electric field and
magnetic induction on the charge and current. The Lorentz force law provides this link and
completes the system:

$$\mathbf{F} = m\frac{d\mathbf{v}}{dt} = q(\mathbf{E} + \mathbf{v} \times \mathbf{B}), \tag{4}$$

where v is the velocity of the charge and F is the force on it due to electric field
and magnetic induction.
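
To make the Lorentz force law concrete, here is a minimal Python sketch of $\mathbf{F} = q(\mathbf{E} + \mathbf{v} \times \mathbf{B})$ in the units used above ($c = 1$). The helper names and the sample field values are illustrative assumptions. The last lines check the familiar consequence that the magnetic part of the force does no work, since $(\mathbf{v} \times \mathbf{B}) \cdot \mathbf{v} = 0$.

```python
def cross(a, b):
    """Vector product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def lorentz_force(q, E, v, B):
    """F = q (E + v x B), Gaussian units with c = 1."""
    v_cross_B = cross(v, B)
    return tuple(q * (e + w) for e, w in zip(E, v_cross_B))

# Sample values (illustrative only).
charge = 2.0
E_field = (1.0, 0.0, 0.0)
B_field = (0.0, 0.0, 0.5)
velocity = (0.0, 3.0, 0.0)

force = lorentz_force(charge, E_field, velocity, B_field)
print(force)

# With E = 0 the force is purely magnetic and orthogonal to v: no work is done.
magnetic_only = lorentz_force(1.0, (0.0, 0.0, 0.0), (1.0, 2.0, 3.0), (4.0, 5.0, 6.0))
print(dot(magnetic_only, (1.0, 2.0, 3.0)))
```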
In 1908, Minkowski**+ put the Maxwell equations into geometric form in four-dimensional
spacetime with Lorentz covariance using Cartesian coordinates x, y, z
and imaginary time $it$ and numbering them as $x_1 = x$, $x_2 = y$, $x_3 = z$ and $x_4 = it$.
Minkowski defined the 4-dim excitation ${}^{(\mathrm{Mink})}f$ and the 4-dim field strength ${}^{(\mathrm{Mink})}F$
as

$${}^{(\mathrm{Mink})}f = \left({}^{(\mathrm{Mink})}f_{hk}\right) = \begin{pmatrix} 0 & H_3 & -H_2 & -iD_1 \\ -H_3 & 0 & H_1 & -iD_2 \\ H_2 & -H_1 & 0 & -iD_3 \\ iD_1 & iD_2 & iD_3 & 0 \end{pmatrix}, \tag{5a}$$

$${}^{(\mathrm{Mink})}F = \left({}^{(\mathrm{Mink})}F_{hk}\right) = \begin{pmatrix} 0 & B_3 & -B_2 & -iE_1 \\ -B_3 & 0 & B_1 & -iE_2 \\ B_2 & -B_1 & 0 & -iE_3 \\ iE_1 & iE_2 & iE_3 & 0 \end{pmatrix}. \tag{5b}$$

In terms of these quantities, Minkowski put the Maxwell equations into the 4-dim
covariant form:

$${}^{(\mathrm{Mink})}f_{kh,k} = -s_h, \tag{6a}$$
$${}^{(\mathrm{Mink})}F^{*}_{kh,k} = 0, \tag{6b}$$

with

$$F^{*}_{hk} = (\tfrac{1}{2})\, e_{hklm} F_{lm}, \ \text{and } s_h \text{ the 4-current}. \tag{6c}$$

Here $e_{hklm} = \pm 1, 0$ is the totally antisymmetric Levi-Civita symbol with $e_{1234} = +1$.
Equations (6a), (6b) are covariant in the sense that for the linear transformations
with constant coefficients that leave the form

$$x_p x_p = (x_1)^2 + (x_2)^2 + (x_3)^2 + (x_4)^2 \tag{7}$$

invariant, the Maxwell equations in the form (6a), (6b) are covariant with the
4-dim excitation ${}^{(\mathrm{Mink})}f$ and the 4-dim field strength ${}^{(\mathrm{Mink})}F$ transforming as 4-dim
covariant tensors (covariant V-six-vectors).
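
The role of the imaginary time coordinate can be made explicit with a short numerical check. In the Python sketch below (names and sample numbers are illustrative assumptions), a Lorentz boost along $x$ is written as an ordinary rotation of the pair $(x_1, x_4 = it)$ through the imaginary angle $i\psi$, where $\psi$ is the rapidity. The rotation leaves the form (7) unchanged, which in real variables is the invariance of $x^2 - t^2$.

```python
import cmath
import math

def rotate_x1_x4(x1, x4, rapidity):
    """Rotate (x1, x4) by the imaginary angle i*rapidity.

    cos(i*psi) = cosh(psi) and sin(i*psi) = i*sinh(psi), so this is a Lorentz boost.
    """
    angle = 1j * rapidity
    c, s = cmath.cos(angle), cmath.sin(angle)
    return c * x1 + s * x4, -s * x1 + c * x4

def boost(x, t, rapidity):
    """Boosted real coordinates (x', t') obtained through x4 = i t."""
    x1_new, x4_new = rotate_x1_x4(x, 1j * t, rapidity)
    return x1_new.real, (x4_new / 1j).real

def quadratic_form(x1, x4):
    """The (x1, x4) part of the form (7): (x1)^2 + (x4)^2."""
    return x1 * x1 + x4 * x4

x, t, psi = 3.0, 5.0, 0.8
x_new, t_new = boost(x, t, psi)
print(x_new, t_new)

# Both values equal x^2 - t^2, up to rounding.
print(quadratic_form(x, 1j * t).real, quadratic_form(x_new, 1j * t_new).real)
```

The boost velocity corresponding to rapidity $\psi$ is $v = \tanh\psi$ in units with $c = 1$.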

Bateman® used time coordinate t instead of $x_4$, and considered transformations
that leave invariant the differential (form) equation:


(dx)? + (dy)? + (dz)? — (dt)? =0. (8) 


Hence, he also included conformal transformations in addition to Lorentz trans- 
formations and made one step toward general coordinate invariance. With indefi- 
nite metric, one has to distinguish covariant and contravariant tensors and indices. 
Aware of this, one can readily put Maxwell equations into covariant form without 
using imaginary time. 
In terms of field strength $F_{kl}(E, B)$ and excitation (density) $H^{ij}(D, H)$:

$$F_{kl} = \begin{pmatrix} 0 & E_1 & E_2 & E_3 \\ -E_1 & 0 & -B_3 & B_2 \\ -E_2 & B_3 & 0 & -B_1 \\ -E_3 & -B_2 & B_1 & 0 \end{pmatrix}, \tag{9a}$$

$$H^{ij} = \begin{pmatrix} 0 & -D_1 & -D_2 & -D_3 \\ D_1 & 0 & -H_3 & H_2 \\ D_2 & H_3 & 0 & -H_1 \\ D_3 & -H_2 & H_1 & 0 \end{pmatrix}, \tag{9b}$$

Maxwell equations can be expressed as

$$H^{ij}{}_{,j} = -4\pi J^i, \tag{10a}$$
$$e^{ijkl} F_{jk,l} = 0, \tag{10b}$$

with the constitutive relation (2) between the excitation and the field in the form:

$$H^{ij} = \chi^{ij}(F_{kl}), \tag{11}$$

where $J^k$ is the charge 4-current density (ρ, J) and $e^{ijkl}$ the completely
antisymmetric tensor density (Levi-Civita symbol) with $e^{0123} = 1$ (see, e.g. Ref. 6).
Here $\chi^{ij}(F_{kl})$ is a functional with six independent degrees of freedom. For a medium
with a local linear response or in the linear local approximation, (11) reduces to

$$H^{ij} = \chi^{ijkl} F_{kl}, \tag{12}$$

with $\chi^{ijkl}$ the (linear) constitutive tensor density.°–!° For isotropic dielectric and
isotropic permeable medium, the constitutive tensor density has two degrees of freedom;
for anisotropic dielectric and anisotropic permeable medium, the constitutive
tensor density has 12 degrees of freedom; for general linear local medium (with
magnetoelectric response), the constitutive tensor has 21 degrees of freedom.
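
The count of 21 can be reproduced by brute-force enumeration. The Python sketch below assumes the constitutive tensor is antisymmetric within each index pair, $\chi^{ijkl} = -\chi^{jikl} = -\chi^{ijlk}$, and, for the second count, also symmetric under exchange of the two pairs, $\chi^{ijkl} = \chi^{klij}$ (the symmetry that holds when the relation derives from a Lagrangian quadratic in $F_{ij}$). The function name is an illustrative choice.

```python
from itertools import product

def count_components(pair_exchange_symmetric):
    """Count independent components of chi^{ijkl} in four dimensions.

    Always assumed: antisymmetry in (i, j) and in (k, l).
    Optionally assumed: symmetry under exchange of the pairs (i, j) <-> (k, l).
    """
    independent = set()
    for i, j, k, l in product(range(4), repeat=4):
        if i == j or k == l:
            continue  # vanishes by antisymmetry within a pair
        first, second = tuple(sorted((i, j))), tuple(sorted((k, l)))
        if pair_exchange_symmetric:
            first, second = sorted((first, second))
        independent.add((first, second))
    return len(independent)

print(count_components(False))  # 36 = 6 x 6: antisymmetry within each pair only
print(count_components(True))   # 21 = 6 * 7 / 2: pair-exchange symmetry added
```

The 12 degrees of freedom quoted for an anisotropic dielectric and permeable medium correspond to two symmetric 3 × 3 matrices (permittivity and permeability), 6 components each.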
Introducing the metric $g_{ij}$ as gravitational potential in 1913 (Ref. 11) and versed
in general (coordinate-)covariant formalism in 1914,'* Einstein put the Maxwell
equations in general covariant form ($\mathfrak{F}^{ij} = H^{ij}$ in our notation)":

$$\mathfrak{F}^{ij}{}_{,j} = -4\pi J^i, \tag{13a}$$
$$F_{ij,k} + F_{jk,i} + F_{ki,j} = 0. \tag{13b}$$

Shortly after constructing general relativity, Einstein noticed in 1916 that the
Maxwell equations can be formulated in a form independent of the metric gravitational
potential.!° Einstein introduced the covariant V-six-vector equations
(13a) and (13b), which are independent of metric gravitational potential. Only the
constitutive tensor density $\chi^{ijkl}$ is dependent on the metric gravitational potential:

$$\mathfrak{F}^{ij} = (-g)^{1/2} g^{ik} g^{jl} F_{kl}. \tag{14}$$

Noticing that Einstein’s $\mathfrak{F}^{ij}$ is our $H^{ij}$ and putting (14) in the form of (12), we have

$$\chi^{ijkl} = (-g)^{1/2}\left[(\tfrac{1}{2})\, g^{ik} g^{jl} - (\tfrac{1}{2})\, g^{il} g^{kj}\right]. \tag{15}$$

In a local inertial frame the metric-induced constitutive tensor (15) is reduced to the
special-relativistic Minkowski form:

$$\chi^{ijkl} = (\tfrac{1}{2})\, \eta^{ik} \eta^{jl} - (\tfrac{1}{2})\, \eta^{il} \eta^{kj}, \tag{16}$$

which is dictated by the EEP.
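
A direct contraction shows what the Minkowski form (16) means in three-vector language. The Python sketch below builds $\chi^{ijkl}$ from $\eta = \mathrm{diag}(1, -1, -1, -1)$, contracts it with a field strength $F_{kl}$ as in (12), and reads off D and H from the result. Both the signature and the component convention ($F_{0a} = E_a$, $F_{12} = -B_3$ and cyclic; $H^{0a} = -D_a$, $H^{12} = -H_3$ and cyclic) are assumptions made for this illustration, chosen so that (10a) reproduces (1a) and (1b). Under these assumptions the vacuum relation comes out as D = E and H = B.

```python
from itertools import product

# Assumed Minkowski metric, signature (+, -, -, -).
ETA = ((1, 0, 0, 0), (0, -1, 0, 0), (0, 0, -1, 0), (0, 0, 0, -1))

def chi_minkowski(i, j, k, l):
    """chi^{ijkl} = (1/2) eta^{ik} eta^{jl} - (1/2) eta^{il} eta^{kj}, as in (16)."""
    return 0.5 * (ETA[i][k] * ETA[j][l] - ETA[i][l] * ETA[k][j])

def field_strength(E, B):
    """F_{kl} built from E and B (assumed convention: F_{0a} = E_a, F_{12} = -B_3, ...)."""
    (E1, E2, E3), (B1, B2, B3) = E, B
    return ((0.0, E1, E2, E3),
            (-E1, 0.0, -B3, B2),
            (-E2, B3, 0.0, -B1),
            (-E3, -B2, B1, 0.0))

def excitation(F):
    """H^{ij} = chi^{ijkl} F_{kl}, the linear constitutive relation (12)."""
    return [[sum(chi_minkowski(i, j, k, l) * F[k][l]
                 for k, l in product(range(4), repeat=2))
             for j in range(4)] for i in range(4)]

H_tensor = excitation(field_strength((1.0, 2.0, 3.0), (4.0, 5.0, 6.0)))

# Read off D and H (assumed convention: H^{0a} = -D_a, H^{23} = -H_1, H^{13} = H_2, H^{12} = -H_3).
D_vector = (-H_tensor[0][1], -H_tensor[0][2], -H_tensor[0][3])
H_vector = (-H_tensor[2][3], H_tensor[1][3], -H_tensor[1][2])
print(D_vector, H_vector)
```

Replacing `ETA` by a general inverse metric and multiplying by $(-g)^{1/2}$ would give the metric-induced form (15) in the same way.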

In macroscopic medium, the constitutive tensor gives the medium-coupling to 
electromagnetism; it depends on the (thermodynamic) state of the medium and in 
turn depends on temperature, pressure etc. In gravity, the constitutive tensor gives 
the gravity-coupling to electromagnetism; it depends on the gravitational field(s) 
and in turn depends on the matter distribution and its state. 

In gravity, a fundamental issue is how to arrive at the metric from the constitu- 
tive tensor through experiments and observations. That is, how to build the metric 
empirically and test the EEP thoroughly. Are there other degrees of freedom to be 
explored? 

Since ordinary energy compared to Planck energy is very small, in this situation
we can assume that the gravitational (or spacetime) constitutive tensor is a linear
and local function of gravitational field(s), i.e. (12) holds. Since the second half
of the 1970s, we have used the following Lagrangian density $L_I$ ($= L_I^{(EM)} + L_I^{(EM\text{-}P)}$)
with the electromagnetic field Lagrangian $L_I^{(EM)}$ and the field–current
interaction Lagrangian $L_I^{(EM\text{-}P)}$ given by

$$L_I^{(EM)} = -\left(\frac{1}{16\pi}\right) \chi^{ijkl} F_{ij} F_{kl}, \tag{17a}$$
$$L_I^{(EM\text{-}P)} = -A_k J^k, \tag{17b}$$

for studying this issue.!4–!® Here $\chi^{ijkl} = -\chi^{jikl} = \chi^{klij}$ is a tensor density of the
gravitational fields or matter fields to be investigated, $F_{ij} = A_{j,i} - A_{i,j}$ the
electromagnetic field strength tensor with $A_i$ the electromagnetic 4-potential and comma
denoting partial derivation, and $J^k$ the charge 4-current density. The Maxwell
equations (10a), (10b) or (1a)–(1d) can be derived from this Lagrangian with the
relation (12) and (9a), (9b). Using this χ-framework, we have demonstrated the
construction of the light cone core metric from the experiments and observations
as in Table 1.'” After presenting the meaning of various equivalence principles in
Sec. 2 and the structure of premetric electrodynamics in Sec. 3.1, we give a fairly
detailed account of the construction of the metric together with constraints on
Table 1. Constraints on the spacetime constitutive tensor $\chi^{ijkl}$ and construction of the
spacetime structure (metric + Abelian axion field $\varphi$ + dilaton field $\psi$) from
experiments/observations in the skewonless case ($U$: Newtonian gravitational potential).
$g_{ij}$ is the particle metric.!7

Experiment                               Constraints                                      Accuracy
---------------------------------------  -----------------------------------------------  ------------------------------
Pulsar signal propagation                $\chi^{ijkl} \to (1/2)(-h)^{1/2}[h^{ik}h^{jl}$     $10^{-16}$
                                         $- h^{il}h^{kj}]\psi + \varphi e^{ijkl}$
Radio galaxy observation                 (as above)                                       $10^{-32}$
Gamma ray burst (GRB)                    (as above)                                       $10^{-38}$
Cosmic Microwave Background (CMB)        $\psi \to 1$                                     $8 \times 10^{-4}$
  spectrum measurement
Cosmic polarization rotation (CPR)       $\varphi - \varphi_0\ (\equiv \alpha) \to 0$       $|\langle\alpha\rangle| < 0.02$,
  experiment                                                                              $\langle(\alpha - \langle\alpha\rangle)^2\rangle^{1/2} < 0.03$
Eötvös–Dicke–Braginsky experiments       $\psi \to 1$                                     10-19 U
                                         $h_{00} \to g_{00}$                              ig
Vessot–Levine redshift experiment        $h_{00} \to g_{00}$                              $1.4 \times 10^{-4}\,\Delta U$
Hughes–Drever-type experiments           $h_{\mu\nu} \to g_{\mu\nu}$                      $10^{-24}$
                                         $h_{0\mu} \to g_{0\mu}$                          $10^{-19}$–$10^{-20}$
                                         $h_{00} \to g_{00}$                              10749


Abelian axions, dilatons and skewons from the equivalence principle for photons in 
the framework of premetric electrodynamics using cosmic observations as empirical 
support in Secs. 3.2-3.6. Section 3.7 discusses the special case of spacetime/medium 
with constitutive tensor induced by asymmetric metric and its special role. Section 
3.8 addresses the issue of empirical foundation of the closure relation. 

In Sec. 4, we review theorems and relations among various equivalence principles
using the χ-framework including particles and the corresponding χ-framework
for the nonabelian field. In Sec. 5, we discuss the relation of universal metrology
and equivalence principles. In Sec. 6, we review theoretical works on the
gyrogravitational effects. In Sec. 7, experimental progress on the measurement of long
range/intermediate range spin–spin, spin–monopole and spin–cosmos interactions
is updated. In Sec. 8, prospects are discussed.


2. Meaning of Various Equivalence Principles 


Our common understanding and formulation of gravity can be simply described 
in Table 2. Matter produces gravitational field and gravitational field influences 
matter. In Newtonian theory of gravity,'® the Galileo Weak Equivalence Principle 
(WEP I)!9 determines how matter behaves in a gravitational field, and Newton’s 
inverse square law determines how matter produces gravitational field. In a rela- 
tivistic theory of gravity such as a metric theory, the EEP determines how matter 
behaves in a gravitational field, and the field equations determine how matter pro- 
duces gravitational field(s). In Einstein’s general relativity, with a suitable choice 
of the stress—energy tensor, the Einstein equation can imply the EEP. In nonmetric 
Table 2. Gravity and electromagnetism.

Matter --(produces)--> Gravitational Field(s) --(influence(s))--> Matter

Newtonian Gravity        Inverse Square Law               WEP I
Relativistic Gravity     Field Equation(s),               EEP or substitute
                         e.g., Einstein equation

Charges --(produce)--> Electromagnetic Field --(influences)--> Charges

Electromagnetism         Maxwell Equations                Lorentz Force Law


theories of gravity, other versions of equivalence principles may be used. The above 
situations can be summarized in Table 2 together with those for electromagnetism. 

From Table 2, we see the crucial role played by equivalence principles in the
formulation of gravity. In the following, we start with the ancient concepts of
inequivalence and discuss the meaning of various equivalence principles. This section is an
update of Sec. 2 of Ref. 16.


2.1. Ancient concepts of inequivalence 


From the observations that heavy bodies fall faster than light ones in the air,
ancient people, both in the orient and in the west, believed that objects with different
constituents behave differently in a gravitational field. We now know that this is
due to the inequivalent responses to different buoyancy forces and air resistances.


2.2. Macroscopic equivalence principles 


(i) Galileo equivalence principle (WEP I)*9


Using an inclined plane, Galileo (1564-1642) showed that the distance a falling 
body travels from rest varies as the square of the time. Therefore, the motion is 
one of constant acceleration. Moreover, Galileo wrote that “the variation of speed 
in air between balls of gold, lead, copper, porphyry, and other heavy materials is 
so slight that in a fall of 100 cubits (about 46m) a ball of gold would surely not 
outstrip one of copper by as much as four fingers. Having observed this, I came 
to the conclusion that in a medium totally void of resistance all bodies would 
fall with the same speed (together)”!°; thus Galileo had grasped an equivalence 
in gravity. (Galileo had demonstrated the equivalence in his experiment on the 
inclined plane around 1592.) The last conjecture is the famous Galileo equivalence 
principle; it serves as the beginning of our understanding of gravity. More precisely, 
Galileo equivalence principle states that in a gravitational field, the trajectory of a 
test body with a given initial velocity is independent of its internal structure and 
composition (universality of free fall trajectories). 
From Galileo’s observations, one can arrive at the following two well-known
conclusions: 


(a) The gravitational force (weight) at the top of the inclined plane and that at a
middle point of the inclined plane can be regarded as the same to the experimental
limits in those days. Hence a falling body experiences a constant force (its weight).
The motion of a falling body is one of constant acceleration. Therefore a constant
force f induces a motion of constant acceleration a. Hence force and acceleration
(not velocity) are closely related. If one changes the inclinations of the plane to get
different “dilutions” of gravity, one finds

$$f \propto a \tag{18}$$

for a falling body. From Galileo’s observation of the universality of free fall
trajectories, we know that the acceleration a is the same for different bodies. But f
(weight) is proportional to mass m. Hence for different bodies,

$$\frac{f}{m} \propto a. \tag{19}$$

If one chooses appropriate units, one arrives at

$$f = ma \tag{20}$$

for falling bodies. If one further assumes that all kinds of forces are equivalent in
their ability to accelerate and notices the vector nature of forces and accelerations,
one would arrive at Newton’s second law,

$$\mathbf{f} = m\mathbf{a}. \tag{21}$$
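
(b) From Galileo equivalence principle, the gravitational field can be described
by the acceleration of gravity g. Newton’s second law for N particles in external
gravitational field g is

$$m_I \frac{d^2 \mathbf{x}_I}{dt^2} = m_I\, \mathbf{g}(\mathbf{x}_I) + \sum_{J=1,\, J \neq I}^{N} \mathbf{F}_{IJ}(\mathbf{x}_I - \mathbf{x}_J) \quad (I = 1, \ldots, N), \tag{22}$$

where $\mathbf{F}_{IJ}$ is the force acting on particle I by particle J. At a point $\mathbf{x}_0$, expand
$\mathbf{g}(\mathbf{x})$ as follows

$$\mathbf{g}(\mathbf{x}) = \mathbf{g}_0 + \boldsymbol{\lambda} \cdot (\mathbf{x} - \mathbf{x}_0) + O(|\mathbf{x} - \mathbf{x}_0|^2). \tag{23}$$

Choosing $\mathbf{x}_0$ as origin and applying the following non-Galilean spacetime coordinate
transformation

$$\mathbf{x}' = \mathbf{x} - (\tfrac{1}{2})\, \mathbf{g}_0 t^2, \quad t' = t, \tag{24}$$

(22) is transformed to

$$m_I \frac{d^2 \mathbf{x}'_I}{dt'^2} = \sum_{J=1,\, J \neq I}^{N} \mathbf{F}_{IJ}(\mathbf{x}'_I - \mathbf{x}'_J) + O(\mathbf{x}'_I) \quad (I = 1, \ldots, N). \tag{25}$$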

Equivalence principles, spacetime structure and the cosmic connection 1-273 


Thus we see that locally the effect of external gravitational field can be trans- 
formed away. Thus we arrive at a Strong Equivalence Principle (SEP). Therefore 
in Newtonian mechanics, 


Galileo Weak Equivalence Principle = Strong Equivalence Principle. 
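
The step from (22) to (25) can be checked numerically in the simplest setting. The Python sketch below is an illustration under stated assumptions: a single free particle, an exactly uniform field $\mathbf{g}_0$ (so the gradient term in (23) is dropped), and arbitrary sample initial data. It applies the transformation (24) and estimates accelerations by finite differences. In the original coordinates the acceleration is $\mathbf{g}_0$; in the primed coordinates it vanishes, i.e. the uniform field has been transformed away.

```python
def lab_position(t, x0, v0, g0):
    """Free fall in a uniform field: x(t) = x0 + v0 t + (1/2) g0 t^2."""
    return tuple(x + v * t + 0.5 * g * t * t for x, v, g in zip(x0, v0, g0))

def falling_frame_position(t, x, g0):
    """Transformation (24): x' = x - (1/2) g0 t^2, with t' = t."""
    return tuple(xi - 0.5 * gi * t * t for xi, gi in zip(x, g0))

def acceleration(path, t, h=1e-3):
    """Second time derivative of a path by central finite differences."""
    ahead, here, behind = path(t + h), path(t), path(t - h)
    return tuple((a - 2.0 * b + c) / (h * h) for a, b, c in zip(ahead, here, behind))

# Sample data (illustrative only).
G0 = (0.0, 0.0, -9.8)
X0 = (1.0, 2.0, 3.0)
V0 = (0.5, -0.3, 4.0)

def lab_path(t):
    return lab_position(t, X0, V0, G0)

def falling_path(t):
    return falling_frame_position(t, lab_path(t), G0)

print(acceleration(lab_path, 1.0))      # approximately g0
print(acceleration(falling_path, 1.0))  # approximately zero
```

With a non-uniform field the residual acceleration in the primed frame would be of first order in the distance from the origin, which is the $O(\mathbf{x}'_I)$ term in (25).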


In the days of Galileo and Newton, the nature of light and radiation was con- 
troversial and had to wait for further development to clarify it. 


(ii) The second weak equivalence principle (WEP II) 


Since the motion of a macroscopic test body is determined not only by its trajectory 
but also by its rotation state, we have proposed from our previous studies?®?! the 
following stronger weak equivalence principle to be tested by experiments, which 
states that in a gravitational field, the motion of a test body with a given initial 
motion state is independent of its internal structure and composition (universality 
of free fall motions). By a test body, we mean a macroscopic body whose size is 
small compared to the length scale of the inhomogeneities of the gravitational field. 
The macroscopic body can have an intrinsic angular momentum (spin) including 
net quantum spin. 


2.3. Equivalence principles for photons (wave packets of light) 
(i) WEP I for photons (wave packets of light): 


In analogy to the Galileo equivalence principle for test bodies, the WEP I for
photons states that the spacetime trajectory of light in a gravitational field depends
only on its initial position and direction of propagation, and does not depend on its
frequency (energy) and polarization.


(ii) WEP II for photons (wave packets of light): 


The trajectory of light in a gravitational field depends only on its initial position and
direction of propagation, and not on its frequency (energy) and polarization;
the polarization state of the light does not change, e.g. no polarization rotation for
linearly polarized light; and there is no amplification/attenuation of light.

N.B. We consider the propagation (or trajectory) in eikonal approximation, i.e.
in geometrical optics approximation. The wavelength must be small (just as for the size
of a test body) compared to the inhomogeneity scale of the gravitational field.


2.4. Microscopic equivalence principles 


The development of physics in the 19th century led to an improved understanding
of light and radiation and to the development of special relativity. In 1905,
Einstein [22] postulated the equivalence of mass and energy and proposed the famous
formula $E = mc^2$. A natural question arose at this point: How do light and
radiation behave in a gravitational field? In 1889, the Eötvös [23] experiment showed
that inertial mass and gravitational mass are equal to a high precision of $10^{-8}$. In
June 1907, Planck [24] reasoned that since all energies have inertial properties, all
energies must gravitate. This paved the way to including energy in the formulation
of the equivalence principle.

N.B. Since the power of EEP only reaches the gradient of gravity potential, 
it applies only to a region where the second-order gradients or curvature can be 
neglected. In applying the equivalence principle to wave packets or a microscopic 
wave function, we have to assume that the extension is limited to such a region. 
For example, it should not be applied to a long-distance entangled state. 


(i) Einstein equivalence principle 


Two years after the proposal of special relativity and the formula $E = mc^2$, and six
months after Planck reasoned that all energy must gravitate, Einstein [25], in the last
part (Principle of Relativity and Gravitation) of his comprehensive 1907 essay on
relativity, proposed the complete physical equivalence of a homogeneous gravitational
field to a uniformly accelerated reference system: “We consider two systems
of motion, $\Sigma_1$ and $\Sigma_2$. Suppose $\Sigma_1$ is accelerated in the direction of its X-axis, and
$\gamma$ is the magnitude (constant in time) of this acceleration. Suppose $\Sigma_2$ is at rest,
but situated in a homogeneous gravitational field, which imparts to all objects an
acceleration $-\gamma$ in the direction of the X-axis. As far as we know, the physical laws
with respect to $\Sigma_1$ do not differ from those with respect to $\Sigma_2$, this derives from the
fact that all bodies are accelerated alike in the gravitational field. We have therefore
no reason to suppose in the present state of our experience that the systems $\Sigma_1$
and $\Sigma_2$ differ in any way, and will therefore assume in what follows the complete
physical equivalence of the gravitational field and the corresponding acceleration of
the reference system.”* From this equivalence, Einstein derived clock and energy
redshifts in a gravitational field. When applied to a spacetime region where
inhomogeneities of the gravitational field can be neglected, this equivalence dictates
the behavior of matter in a gravitational field. The postulate of this equivalence is
called the EEP. EEP is the cornerstone of the gravitational coupling of matter and
nongravitational fields in general relativity and in metric theories of gravity.

EEP is a microscopic principle and may mean slightly different things to different
people. To most people, EEP is equivalent to the comma-goes-to-semicolon rule
for matter (not including gravitational energy) in a gravitational field. Therefore,
EEP means that in any and every local Lorentz (inertial) frame, anywhere and
anytime in the universe, all the (nongravitational) laws of physics must take on
their familiar special-relativistic forms [26]. That is, local (nongravitational) physics
should be universally special relativistic. In other words, EEP says that the outcome
of any local, nongravitational test experiment is independent of the velocity of the
apparatus. For example, the fine structure constant $\alpha = e^2/\hbar c$ must be independent
of location, time, and velocity.

*Einstein further clarified the application of this equivalence to inhomogeneous fields, e.g. in his
book “The Meaning of Relativity” (p. 58, Fifth edition, Princeton University Press, 1955): “...We
may look upon the principle of inertia as established, to a high degree of approximation, for the
space of our planetary system, provided that we neglect the perturbations due to the sun and
planets. Stated more exactly, there are finite regions, where, with respect to a suitably chosen
space of reference, material particles move freely without acceleration, and in which the laws of
special relativity, which have been developed above, hold with remarkable accuracy. Such regions
we shall call “Galilean regions”. We shall proceed from the consideration of such regions as a
special case of known properties”.


(ii) Modified Einstein equivalence principle 


In 1921, Eddington [27] mentioned the notion of an asymmetric affine connection in
discussing possible extensions of general relativity. In 1922, Cartan [28] introduced
torsion as the antisymmetric part of an asymmetric affine connection and laid
the foundation of this generalized geometry. Cartan [29] proposed that the torsion of
spacetime might be connected with the intrinsic angular momentum of matter. In
1921–1922, Stern and Gerlach [30] discovered the space quantization of atomic magnetic
moments. In 1925–1926, Goudsmit and Uhlenbeck [31] introduced our present
concept of electron spin as the culmination of a series of studies of doublet and
triplet structures in spectra. Following the idea of Cartan, Sciama [32,33] and Kibble [34]
developed a theory of gravitation which is commonly called the
Einstein–Cartan–Sciama–Kibble (ECSK) theory of gravity.

After the works of Utiyama [35], Sciama [32,33] and Kibble [34], interest and activities
in gauge-type and torsion-type theories of gravity have continuously increased.
Different theories postulate somewhat different interactions of matter with the
gravitational field(s). In ECSK theory, in Poincaré gauge theories [36,37] and in some
other torsion theories, there is a torsion gravitational field besides the usual metric
field [38]. In special relativity, if we use a nonholonomic tetrad frame, there is
an antisymmetric part of the affine connection. Therefore many people working
on torsion theory take the equivalence principle to mean something different from
EEP so that torsion can be included. This is most clearly stated in von der Heyde’s
paper “The Equivalence Principle in the $U_4$ Theory of Gravitation” [39]: Locally the
properties of special relativistic matter in a noninertial frame of reference cannot
be distinguished from the properties of the same matter in a corresponding gravitational
field. This Modified Einstein Equivalence Principle (MEEP) allows for formal
inertial effects in a nonholonomic tetrad frame and hence allows torsion. There are
two ways to treat the level of coupling of torsion: one can consider torsion on the
same level as the symmetric affine connection (MEEP I) or one can consider torsion
on the same level as the curvature tensors (MEEP II). Hehl and von der Heyde [40]
hold the second point of view. MEEP I allows torsion. Since torsion is a tensor,
it cannot be transformed away in any frame if it is not zero. EEP is equivalent to
MEEP I plus no torsion; therefore EEP implies MEEP I, but MEEP I does
not imply EEP. For a test body, curvature effects are neglected; so MEEP II is
essentially equivalent to EEP for test bodies. Test bodies with nonvanishing total
intrinsic spin feel torques from the torsion field. Hence MEEP I does not imply
WEP II. Moreover MEEP I does not imply WEP I either [40]. Therefore we have the
following:

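$$
\begin{array}{ccc}
\text{EEP} & \Rightarrow & \text{MEEP I} \\
\Downarrow\ \Uparrow^{\dagger} & & \not\Downarrow \\
\text{WEP II} & \Rightarrow & \text{WEP I}
\end{array}
$$

†“WEP II implies EEP” is proved for an electromagnetic system in the $\chi$–$g$
framework [20,21,41]. However, for other frameworks, the issue is still open.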


2.5. Equivalence principles including gravity (Strong equivalence 
principles) 


How does gravitational energy behave in a gravitational field? Does the outcome of a
local gravity experiment depend on where and when in the universe it is performed?
These questions involve nonlinear gravity effects.


(i) WEP I for massive bodies 


This weak equivalence principle says that in a gravitational field, the trajectory of
a massive test body with a given initial velocity is also independent of the amount
of gravitational self-energy inside the massive body. In Brans–Dicke theory and
many other theories, there are violations of this equivalence principle. The violations
are called Nordtvedt effects [42,43]. General relativity obeys WEP I for massive
bodies in the post-Newtonian limit and for black hole solutions. The nonexistence of
Nordtvedt effects is an efficient way to single out purely metric theories among metric
theories of gravity (those that comply with EEP). Lunar laser ranging experiments
and binary pulsar timing observations place limits on the Nordtvedt effect.


(ii) Dicke’s [44] strong equivalence principle (SEP)


This is a microscopic equivalence principle. It says that the outcome of any local
test experiment, gravitational or nongravitational, is independent of where and
when in the universe it is performed, and independent of the velocity of the apparatus.
If this equivalence principle is valid, the Newtonian gravitational constant
$G_N$ should be a true constant. Brans–Dicke theory with its variable “gravitational
constant” as measured by Cavendish experiments satisfies EEP but violates SEP.
Also, if this equivalence principle is valid, a self-gravitating system in a background
with length scale much larger than the self-gravitating system should have local
Lorentz invariance in the background, e.g. no preferred-frame effects [45,46].

The violations of SEP seem to be linked with the violations of WEP I for massive 
bodies in many cases. It is interesting to know how SEP and WEP I for massive 
bodies are connected. The violations of SEP may also be connected to the violations 
of WEP I at some level in some cases. 

We note in passing that there are other versions of equivalence principles which
we are not able to list here one by one. For recent discussions on equivalence
principles, see also Refs. 47 and 48.


2.6. Inequivalence and interrelations of various equivalence
principles


In the preceding subsections, we have listed and explained various equivalence
principles. Logically all these equivalence principles are different. An important issue is
to what extent they are equivalent, and in what situations they are inequivalent.
This issue has been conspicuous for more than 50 years, since the Dicke–Schiff
redshift controversy. In 1960, Schiff [49] argued as follows: “The Eötvös experiments
show with considerable accuracy that the gravitational and inertial masses of normal
matter are equal. This means that the ground state eigenvalue of the Hamiltonian
for this matter appears equally in the inertial mass and in the interaction
of this mass with a gravitational field. It would be quite remarkable if this could
occur without the entire Hamiltonian being involved in the same way, in which
case a clock composed of atoms whose motions are determined by this Hamiltonian
would have its rate affected in the expected manner by a gravitational field.” He
suggested that EEP and, hence, the metric gravitational redshift are consequences
of WEP I. In short, Schiff believed that


$$\text{WEP I} \Rightarrow \text{EEP}.$$


This conjecture is known as Schiff’s conjecture. The scope of validity of Schiff’s 
conjecture has great importance in the analysis of the empirical foundations of 
EEP. 

However, Dicke [50] held a different point of view and believed that the redshift
experiment has independent theoretical significance. In November 1970, interest
in the validity of Schiff’s conjecture was rekindled during a vigorous
argument between Schiff and Thorne at the Caltech-JPL Conference on Experimental
Tests of Gravitation Theories. In 1973, Thorne, Lee and Lightman [51] analyzed
the fundamental concepts and terms involved in detail and gave a plausibility argument
supporting Schiff’s conjecture. Lightman and Lee [52] proved Schiff’s conjecture
for electromagnetically interacting systems in a static, spherically symmetric
gravitational field using the $TH\epsilon\mu$ formalism. I found a nonmetric theory which
includes pseudoscalar–photon interaction and showed that it is a counterexample
to Schiff’s conjecture [53]. In 1974, I showed that this counterexample is the only
case in a general premetric constitutive tensor formulation of electromagnetism
($\chi$-framework) with the standard particle Lagrangian (the whole framework is called the
$\chi$–$g$ framework) [20,21]. This supports the view that Schiff’s approach is right in the
large, although not completely right. In the eikonal approximation of the $\chi$–$g$
framework, I showed that the first-order gravitational redshifts are metric [21] (so
Schiff was right for redshift in this case to first order). In the latter part of the 1970s, I
used the $\chi$–$g$ framework to look into the issue of gravitational coupling to
electromagnetism empirically [14–16,40]. In the next section, we will review the progress on
this issue. Recently, the significance of redshift experiments has been brought up again in
the comparison of redshift and atom interferometry experiments [54,55].

For the SEP, one could ask similar questions. Would WEP I for massive bodies
imply Dicke’s SEP? This is a direct extension of Schiff’s conjecture. One can call
it Schiff’s conjecture for massive bodies. There has been significant progress recently.
Gérard [56] has worked out a link between the vanishing of Nordtvedt effects and
a condition of SEP. Di Casola, Liberati and Sonego [57] have employed WEP I for
massive bodies as a sieve for purely metric theories of gravity using a variational
approach. They also propose the conjecture that SEP is equivalent to the union
of WEP I for massive bodies (GWEP in their terminology) and EEP. Since WEP I does
not imply EEP (Schiff’s conjecture is incorrect) [20,21,53], we would like to propose to
investigate the validity of the following two statements in various frameworks: (i)
WEP II for massive bodies is equivalent to Dicke’s SEP; (ii) SEP is equivalent to
the union of WEP II for massive bodies and EEP.


3. Gravitational Coupling to Electromagnetism and the Structure 
of Spacetime 


3.1. Premetric electrodynamics as a framework to study 
gravitational coupling to electromagnetism 


For the ordinary gravitational field, we are in a low-energy situation compared to the
Planck energy, as we mentioned in Sec. 1. If we represent the gravitational coupling to
electromagnetism by a constitutive tensor density, the constitutive tensor density
must be linear and local as given by (12), independent of the field strength $F_{kl}$ and
dependent only on the gravitational field(s). The constitutive tensor density (12) has
three irreducible pieces. Both $H^{ij}$ and $F_{kl}$ are antisymmetric, hence $\chi^{ijkl}$ must be
antisymmetric in $i$ and $j$, and in $k$ and $l$. Therefore the constitutive tensor density $\chi^{ijkl}$
has 36 (6 × 6) independent components. A general linear constitutive tensor density
$\chi^{ijkl}$ in electrodynamics can first be decomposed into two parts, the symmetric part
in the exchange of index pairs $ij$ and $kl$ [$(1/2)(\chi^{ijkl} + \chi^{klij})$] and the antisymmetric
part in the exchange of index pairs $ij$ and $kl$ [$(1/2)(\chi^{ijkl} - \chi^{klij})$]. The first part has
21 degrees of freedom and contains the totally antisymmetric part, the axion part
(Ax). Subtracting the axion part, the remaining part is the principal part, which
has 20 degrees of freedom. The second part is the skewon part and has 15 degrees
of freedom. The principal part (P), the Abelian axion part (Ax) and the
Hehl–Obukhov–Rubilar skewon part (Sk) constitute the three irreducible parts under
the group of general coordinate transformations [6]:

$$\chi^{ijkl} = {}^{(P)}\chi^{ijkl} + {}^{(Sk)}\chi^{ijkl} + {}^{(Ax)}\chi^{ijkl}, \qquad (\chi^{ijkl} = -\chi^{jikl} = -\chi^{ijlk}) \tag{26}$$

with

$${}^{(P)}\chi^{ijkl} = \tfrac{1}{6}\left[2(\chi^{ijkl} + \chi^{klij}) - (\chi^{iklj} + \chi^{ljik}) - (\chi^{iljk} + \chi^{jkil})\right], \tag{27a}$$

$${}^{(Ax)}\chi^{ijkl} = \chi^{[ijkl]} = \varphi\, e^{ijkl}, \tag{27b}$$

$${}^{(Sk)}\chi^{ijkl} = \tfrac{1}{2}(\chi^{ijkl} - \chi^{klij}). \tag{27c}$$
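
The decomposition (26) with (27a)–(27c) and the quoted counts of 20, 15 and 1 degrees of freedom can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function names (`random_constitutive_tensor`, `decompose`, `principal_from_27a`, `count_dof`) are illustrative and are not taken from the cited references.

```python
import itertools
import numpy as np


def random_constitutive_tensor(rng):
    """Random chi^{ijkl}, antisymmetric in (i, j) and in (k, l)."""
    chi = rng.normal(size=(4, 4, 4, 4))
    chi = chi - chi.transpose(1, 0, 2, 3)  # antisymmetrize in i, j
    chi = chi - chi.transpose(0, 1, 3, 2)  # antisymmetrize in k, l
    return chi


def permutation_sign(perm):
    """Sign of a permutation of (0, ..., n-1), found by counting inversions."""
    sign = 1
    for a in range(len(perm)):
        for b in range(a + 1, len(perm)):
            if perm[a] > perm[b]:
                sign = -sign
    return sign


def decompose(chi):
    """Return the (principal, skewon, axion) parts of chi."""
    swapped = chi.transpose(2, 3, 0, 1)  # chi^{klij}
    skewon = 0.5 * (chi - swapped)       # Eq. (27c)
    axion = sum(permutation_sign(p) * chi.transpose(p)
                for p in itertools.permutations(range(4))) / 24.0  # chi^{[ijkl]}
    principal = 0.5 * (chi + swapped) - axion  # symmetric part minus axion part
    return principal, skewon, axion


def principal_from_27a(chi):
    """Principal part written out explicitly as in Eq. (27a)."""
    e = np.einsum
    return (2.0 * (chi + e('klij->ijkl', chi))
            - (e('iklj->ijkl', chi) + e('ljik->ijkl', chi))
            - (e('iljk->ijkl', chi) + e('jkil->ijkl', chi))) / 6.0


def count_dof(part_index, n_samples=60, seed=0):
    """Dimension of the space spanned by one irreducible part (by sampling)."""
    rng = np.random.default_rng(seed)
    rows = [decompose(random_constitutive_tensor(rng))[part_index].ravel()
            for _ in range(n_samples)]
    return int(np.linalg.matrix_rank(np.array(rows), tol=1e-8))
```

Here the principal part is computed in two ways, as "pair-symmetric part minus totally antisymmetric part" and via the explicit formula (27a), so that the two expressions can be compared; the dimension count uses the rank of a stack of random samples rather than an analytic argument.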
Decomposition (26) is unique. If we substitute (26) into (17a), the skewon part
does not contribute to the Lagrangian; hence, a Lagrangian-based theory is
skewonless. The systematic study of skewonful cases started in 2002 (see, e.g. Ref. 6).

The complete agreement with EEP for the photon sector requires (as locally in
special relativity) (i) no birefringence; (ii) no polarization rotation; (iii) no
amplification/no attenuation in spacetime propagation. In Secs. 3.2–3.5, we review how
the cosmic connection/observation of these three conditions on electromagnetic
propagation verifies EEP and determines the spacetime structure in the skewonless
case (Lagrangian-based case). In Sec. 3.2, we derive wave propagation and dispersion
relations in the lowest eikonal approximation in the weak field in premetric
electrodynamics. In Sec. 3.3, we apply it to the determination of the spacetime
structure in the skewonless case using the no-birefringence condition. With no
birefringence, any skewonless spacetime constitutive tensor must be of the form

$$\chi^{ijkl} = (-h)^{1/2}\left[\tfrac{1}{2} h^{ik} h^{jl} - \tfrac{1}{2} h^{il} h^{kj}\right]\psi + \varphi\, e^{ijkl}, \tag{28}$$

where $h^{ij}$ is a metric constructed from $\chi^{ijkl}$ ($h = \det(h_{ij})$ and $h_{ij}$ the inverse of $h^{ij}$)
which generates the light cone for electromagnetic wave propagation, $\psi$ a dilaton
field constructed from $\chi^{ijkl}$, and $\varphi$ an Abelian axion field constructed from $\chi^{ijkl}$.
Observations of no birefringence in the cosmic propagation of electromagnetic waves
constrain the spacetime constitutive tensor to the form (28) to very high precision.
In Sec. 3.4, we review the derivation of the dispersion relation of wave propagation
in the dilaton field and axion field with constitutive relation (28); we show further that
with the condition of no polarization rotation and the condition of no
amplification/no attenuation satisfied, the axion $\varphi$ and the dilaton $\psi$ should be constant, i.e.
no varying axion field and no varying dilaton field respectively. The EEP for the photon
sector would then be observed; the spacetime constitutive tensor density would
be of metric-induced form. Thus we tie the three observational conditions to EEP
and to the metric-induced spacetime constitutive tensor density in the photon sector. In
Sec. 3.5, we review the empirical constraints on the cosmic dilaton field and cosmic axion
field. The results are summarized in Table 1. In Sec. 3.6, we apply the dispersion
relations derived in Sec. 3.2 to the case of a metric-induced constitutive tensor with
skewons, with further discussions. In Sec. 3.7, we discuss the case of spacetime with
an asymmetric-metric-induced constitutive tensor using the Fresnel equation. In Sec. 3.8,
we review the application of these results to the accuracy of empirical verification
of the closure relations in electrodynamics.

3.2. Wave propagation and the dispersion relation 


The sourceless Maxwell equation (10b) is equivalent to the local existence of a
4-potential $A_i$ such that

$$F_{ij} = A_{j,i} - A_{i,j}, \tag{29}$$

with a gauge transformation freedom of adding an arbitrary gradient of a scalar
function to $A_i$. The Maxwell equation (10a) in vacuum is

$$(\chi^{ijkl} A_{k,l})_{,j} = 0. \tag{30}$$

Using the derivation rule, we have

$$\chi^{ijkl} A_{k,lj} + \chi^{ijkl}{}_{,j} A_{k,l} = 0. \tag{31}$$

(i) For a slowly varying, nearly homogeneous field/medium, and/or (ii) in the eikonal
approximation with typical wavelength much smaller than the gradient scale and
time-variation scale of the field/medium, the second term in (31) can be neglected
compared to the first term, and we have

$$\chi^{ijkl} A_{k,lj} = 0. \tag{32}$$

This approximation is the lowest eikonal approximation, usually also called the
eikonal approximation. In this approximation, the dispersion relation is given by
the generalized covariant quartic Fresnel equation (see, e.g. Ref. 6; also Sec. 3.7). It
is well known that the axion does not contribute to this dispersion relation [6,14–16,58–61],
as we will see in the following. In this subsection, we use this lowest eikonal
approximation and follow Ref. 62 to derive the dispersion relation in the general linear local
constitutive framework. In Sec. 3.4, we keep the second term of (31) and follow
Ref. 63 to find the dispersion relations for the case in which the dilaton gradient and
the axion gradient cannot be neglected.
In the weak field or dilute medium, we assume

$$\chi^{ijkl} = \chi^{(0)ijkl} + \chi^{(1)ijkl} + O(2), \tag{33}$$

where $O(2)$ means second-order in $\chi^{(1)}$. Since the violation from the EEP would be
small and/or if the medium is dilute, in the following we assume that

$$\chi^{(0)ijkl} = \tfrac{1}{2} g^{ik} g^{jl} - \tfrac{1}{2} g^{il} g^{kj}, \tag{34}$$

and $\chi^{(1)ijkl}$ is small compared with $\chi^{(0)ijkl}$. We can then find a local inertial frame
such that $g^{ij}$ becomes the Minkowski metric $\eta^{ij}$, good to the derivative of the metric.
To look for wave solutions, we use the eikonal approximation and choose the z-axis in the
wave propagation direction so that the solution takes the following form:

$$A \equiv (A_0, A_1, A_2, A_3)\, e^{ikz - i\omega t}. \tag{35}$$

We expand the solution as

$$A_i = [A^{(0)}_i + A^{(1)}_i + O(2)]\, e^{ikz - i\omega t}. \tag{36}$$

Imposing the radiation gauge condition in the zeroth order in the weak field/dilute
medium/weak EEP violation approximation, we find the zeroth-order solution of
(36) and the zeroth-order dispersion relation satisfying the zeroth-order equation
$\chi^{(0)ijkl} A^{(0)}_{k,lj} = 0$ as follows:

$$A^{(0)} = (0, A^{(0)}_1, A^{(0)}_2, 0), \qquad \omega = k + O(1). \tag{37}$$

Substituting (36) and (37) into Eq. (32), we have

$$\chi^{(0)ijkl} A^{(1)}_{k,lj} + \chi^{(1)ijkl} A^{(0)}_{k,lj} = 0 + O(2). \tag{38}$$

The $i = 0$ and $i = 3$ components of (38) both give

$$A^{(1)}_0 + A^{(1)}_3 = 2(\chi^{(1)3013} - \chi^{(1)3010}) A^{(0)}_1 + 2(\chi^{(1)3023} - \chi^{(1)3020}) A^{(0)}_2 + O(2). \tag{39}$$

Since this equation does not contain $\omega$ and $k$, it does not contribute to the
determination of the dispersion relation. A gauge condition in the $O(1)$ order fixes
the values of $A^{(1)}_0$ and $A^{(1)}_3$.

The $i = 1$ and $i = 2$ components of (38) are

$$\tfrac{1}{2}(\omega^2 - k^2) A^{(0)}_1 + \chi^{(1)1j1l} A^{(0)}_{1,lj} + \chi^{(1)1j2l} A^{(0)}_{2,lj} = 0 + O(2), \tag{40a}$$

$$\tfrac{1}{2}(\omega^2 - k^2) A^{(0)}_2 + \chi^{(1)2j1l} A^{(0)}_{1,lj} + \chi^{(1)2j2l} A^{(0)}_{2,lj} = 0 + O(2). \tag{40b}$$

These two equations determine the dispersion relation and can be rewritten as

$$\left[\tfrac{1}{2}(\omega^2 - k^2) - k^2 A_{(1)}\right] A^{(0)}_1 - k^2 B_{(1)} A^{(0)}_2 = O(2), \tag{41a}$$

$$-k^2 B_{(2)} A^{(0)}_1 + \left[\tfrac{1}{2}(\omega^2 - k^2) - k^2 A_{(2)}\right] A^{(0)}_2 = O(2), \tag{41b}$$

where

$$A_{(1)} \equiv \chi^{(1)1010} - (\chi^{(1)1013} + \chi^{(1)1310}) + \chi^{(1)1313}, \tag{42a}$$

$$A_{(2)} \equiv \chi^{(1)2020} - (\chi^{(1)2023} + \chi^{(1)2320}) + \chi^{(1)2323}, \tag{42b}$$

$$B_{(1)} \equiv \chi^{(1)1020} - (\chi^{(1)1023} + \chi^{(1)1320}) + \chi^{(1)1323}, \tag{42c}$$

$$B_{(2)} \equiv \chi^{(1)2010} - (\chi^{(1)2013} + \chi^{(1)2310}) + \chi^{(1)2313}. \tag{42d}$$

We note that $A_{(1)}$ and $A_{(2)}$ contain only the principal part of $\chi$; $B_{(1)}$ and $B_{(2)}$
contain only the principal and skewon parts of $\chi$. The axion part drops out and does
not contribute to the dispersion relation in the eikonal approximation. The principal
part ${}^{(P)}B$ and skewon part ${}^{(Sk)}B$ of $B_{(1)}$ are as follows:

$${}^{(P)}B \equiv \tfrac{1}{2}(B_{(1)} + B_{(2)}); \qquad {}^{(Sk)}B \equiv \tfrac{1}{2}(B_{(1)} - B_{(2)}). \tag{43}$$

From (43), $B_{(1)}$ and $B_{(2)}$ can be expressed as

$$B_{(1)} = {}^{(P)}B + {}^{(Sk)}B; \qquad B_{(2)} = {}^{(P)}B - {}^{(Sk)}B. \tag{44}$$
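
The statements about which irreducible parts enter $A_{(1)}$, $A_{(2)}$, $B_{(1)}$ and $B_{(2)}$ follow from the index symmetries alone, so they can be checked on a random constitutive tensor. The following is a minimal sketch, assuming NumPy; the helper names (`coefficient`, `irreducible_parts`, `sign`) are illustrative assumptions, not notation from the text.

```python
import itertools
import numpy as np


def coefficient(chi, a, b):
    """chi^{a0b0} - (chi^{a0b3} + chi^{a3b0}) + chi^{a3b3}, cf. Eqs. (42a)-(42d).

    (a, b) = (1, 1) gives A_(1), (2, 2) gives A_(2),
    (1, 2) gives B_(1) and (2, 1) gives B_(2).
    """
    return chi[a, 0, b, 0] - (chi[a, 0, b, 3] + chi[a, 3, b, 0]) + chi[a, 3, b, 3]


def sign(perm):
    """Sign of a permutation, found by counting inversions."""
    inversions = sum(perm[a] > perm[b]
                     for a in range(len(perm)) for b in range(a + 1, len(perm)))
    return -1 if inversions % 2 else 1


def irreducible_parts(chi):
    """Principal, skewon and axion parts of a pair-antisymmetric chi^{ijkl}."""
    swapped = chi.transpose(2, 3, 0, 1)
    skewon = 0.5 * (chi - swapped)
    axion = sum(sign(p) * chi.transpose(p)
                for p in itertools.permutations(range(4))) / 24.0
    principal = 0.5 * (chi + swapped) - axion
    return principal, skewon, axion


rng = np.random.default_rng(0)
chi = rng.normal(size=(4, 4, 4, 4))
chi = chi - chi.transpose(1, 0, 2, 3)
chi = chi - chi.transpose(0, 1, 3, 2)
principal, skewon, axion = irreducible_parts(chi)

# Principal and skewon parts of B_(1), as defined in Eq. (43).
B_principal = 0.5 * (coefficient(chi, 1, 2) + coefficient(chi, 2, 1))
B_skewon = 0.5 * (coefficient(chi, 1, 2) - coefficient(chi, 2, 1))
```

With these definitions, `coefficient(skewon, 1, 1)` and every coefficient of `axion` should vanish to rounding error, while `B_principal` and `B_skewon` should coincide with `coefficient(principal, 1, 2)` and `coefficient(skewon, 1, 2)` respectively.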
For Eqs. (41a), (41b) to have nontrivial solutions of $(A^{(0)}_1, A^{(0)}_2)$, we must have
the following determinant vanish to first order:

$$
\det\begin{pmatrix}
\tfrac{1}{2}(\omega^2 - k^2) - k^2 A_{(1)} & -k^2 B_{(1)} \\
-k^2 B_{(2)} & \tfrac{1}{2}(\omega^2 - k^2) - k^2 A_{(2)}
\end{pmatrix}
= \tfrac{1}{4}(\omega^2 - k^2)^2 - \tfrac{1}{2}(\omega^2 - k^2)\, k^2 (A_{(1)} + A_{(2)})
+ k^4 (A_{(1)} A_{(2)} - B_{(1)} B_{(2)}) = 0 + O(2). \tag{45}
$$

The solution of this quadratic equation in $\omega^2$, i.e. the dispersion relation, is

$$\omega^2 = k^2\left[1 + (A_{(1)} + A_{(2)}) \pm \left((A_{(1)} - A_{(2)})^2 + 4 B_{(1)} B_{(2)}\right)^{1/2}\right] + O(2), \tag{46}$$

or

$$\omega = k\left[1 + \tfrac{1}{2}(A_{(1)} + A_{(2)}) \pm \tfrac{1}{2}\left((A_{(1)} - A_{(2)})^2 + 4 B_{(1)} B_{(2)}\right)^{1/2}\right] + O(2). \tag{47}$$

From (46) the group velocity is

$$v_g = \frac{\partial \omega}{\partial k} = 1 + \tfrac{1}{2}(A_{(1)} + A_{(2)}) \pm \tfrac{1}{2}\left((A_{(1)} - A_{(2)})^2 + 4 B_{(1)} B_{(2)}\right)^{1/2} + O(2). \tag{48}$$

The quantity under the square root sign is

$$\xi \equiv (A_{(1)} - A_{(2)})^2 + 4 B_{(1)} B_{(2)} = (A_{(1)} - A_{(2)})^2 + 4 ({}^{(P)}B)^2 - 4 ({}^{(Sk)}B)^2. \tag{49}$$

Depending on the sign or vanishing of $\xi$, we have the following three cases of
electromagnetic wave propagation:

(i) $\xi > 0$, $(A_{(1)} - A_{(2)})^2 + 4 ({}^{(P)}B)^2 > 4 ({}^{(Sk)}B)^2$: There is birefringence of wave
propagation;

(ii) $\xi = 0$, $(A_{(1)} - A_{(2)})^2 + 4 ({}^{(P)}B)^2 = 4 ({}^{(Sk)}B)^2$: There are no birefringence and no
dissipation/amplification in wave propagation;

(iii) $\xi < 0$, $(A_{(1)} - A_{(2)})^2 + 4 ({}^{(P)}B)^2 < 4 ({}^{(Sk)}B)^2$: There is no birefringence, but
there are both dissipative and amplifying modes in wave propagation.
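
The three cases can be made concrete with a short numerical sketch of Eqs. (47) and (49). It uses only the Python standard library; the function name `dispersion` and the sample coefficient values are illustrative assumptions chosen to land in each of the three cases.

```python
import cmath


def dispersion(a1, a2, b1, b2):
    """First-order dispersion relation of Eq. (47).

    Returns (xi, (omega_plus / k, omega_minus / k), description), where xi is
    the quantity under the square root, Eq. (49).
    """
    xi = (a1 - a2) ** 2 + 4.0 * b1 * b2
    root = cmath.sqrt(xi)  # purely imaginary when xi < 0
    modes = tuple(1.0 + 0.5 * (a1 + a2) + s * 0.5 * root for s in (+1, -1))
    if xi > 0:
        kind = "birefringence"
    elif xi == 0:
        kind = "no birefringence, no dissipation/amplification"
    else:
        kind = "no birefringence, dissipative and amplifying modes"
    return xi, modes, kind


# One sample set of coefficients per case (values chosen for illustration only).
case_i = dispersion(2e-3, 0.0, 0.0, 0.0)      # A_(1) != A_(2), no skewon part
case_ii = dispersion(1e-3, 1e-3, 0.0, 0.0)    # A_(1) = A_(2), B_(1) = B_(2) = 0
case_iii = dispersion(0.0, 0.0, 1e-3, -1e-3)  # pure skewon: B_(1) = -B_(2)
```

In case (iii) the two values of $\omega/k$ have equal real parts, so the phase velocities coincide, and opposite imaginary parts, which correspond to one decaying and one growing mode.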


3.2.1. The condition of vanishing of $B_{(1)}$ and $B_{(2)}$ for all directions of wave
propagation


From the definition (42c), the condition of vanishing of $B_{(1)}$ for wave propagation
in the z-axis direction is

$$B_{(1)} = \chi^{(1)1020} + \chi^{(1)1323} - \chi^{(1)1023} - \chi^{(1)1320} = 0. \tag{50}$$

To look for conditions derivable in combination with those from other directions,
we do active Lorentz transformations (rotations/boosts). The active rotation $R_\theta$ in the
$y$–$z$ plane with angle $\theta$ is

$$t' = R_\theta t = t, \quad x' = R_\theta x = x, \quad y' = R_\theta y = y\cos\theta + z\sin\theta, \quad z' = R_\theta z = -y\sin\theta + z\cos\theta. \tag{51}$$

Applying the active rotation $R_\theta$ (51) to (50), we have

$$
\begin{aligned}
0 &= \chi'^{(1)1020} + \chi'^{(1)1323} - \chi'^{(1)1023} - \chi'^{(1)1320} \\
  &= \chi^{(1)1020} + \chi^{(1)1323} - \chi^{(1)1023} - \chi^{(1)1320} \\
  &\quad + \theta\,(\chi^{(1)1030} + \chi^{(1)1220} - \chi^{(1)1223} - \chi^{(1)1330}) + O(\theta^2),
\end{aligned} \tag{52}
$$

for small values of $\theta$. From (52) and (50), we have

$$\chi^{(1)1030} + \chi^{(1)1220} - \chi^{(1)1223} - \chi^{(1)1330} = 0. \tag{53}$$
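
The first-order term in (52) can be checked by rotating a random tensor through a small angle and comparing the change in $B_{(1)}$ with the combination appearing in (53). This is a minimal sketch, assuming NumPy and assuming that the components transform as $\chi'^{ijkl} = R^i{}_a R^j{}_b R^k{}_c R^l{}_d \chi^{abcd}$ with $R$ read off from (51); the function names are illustrative.

```python
import numpy as np


def b1(chi):
    """B_(1) as written in Eq. (50)."""
    return chi[1, 0, 2, 0] + chi[1, 3, 2, 3] - chi[1, 0, 2, 3] - chi[1, 3, 2, 0]


def rotate_yz(chi, theta):
    """Rotate all four indices of chi in the y-z plane, following Eq. (51)."""
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    rot = np.eye(4)
    rot[2, 2], rot[2, 3] = cos_t, sin_t    # y' =  y cos(theta) + z sin(theta)
    rot[3, 2], rot[3, 3] = -sin_t, cos_t   # z' = -y sin(theta) + z cos(theta)
    return np.einsum('ia,jb,kc,ld,abcd->ijkl', rot, rot, rot, rot, chi)


def first_order_term(chi):
    """Coefficient of theta in Eq. (52), i.e. the left-hand side of Eq. (53)."""
    return chi[1, 0, 3, 0] + chi[1, 2, 2, 0] - chi[1, 2, 2, 3] - chi[1, 3, 3, 0]


rng = np.random.default_rng(0)
chi = rng.normal(size=(4, 4, 4, 4))
chi = chi - chi.transpose(1, 0, 2, 3)  # antisymmetric in i, j
chi = chi - chi.transpose(0, 1, 3, 2)  # antisymmetric in k, l

theta = 1e-6
finite_difference = (b1(rotate_yz(chi, theta)) - b1(chi)) / theta
```

The finite-difference slope should agree with `first_order_term(chi)` up to corrections of order $\theta$.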


Following the same procedure, we repeatedly apply the active rotation $R_\theta$ to (53) and
the resulting equations together with their linear combinations. After performing the
cyclic permutation 1 → 2 → 3 → 1 on the upper indices once and twice on some
of the resulting equations, we have the following equations (for the detailed derivation,
see arXiv:1312.3056v1):

$$\chi^{(1)1220} = \chi^{(1)1330}, \tag{54a}$$
$$\chi^{(1)2330} = \chi^{(1)2110}, \tag{54b}$$
$$\chi^{(1)3110} = \chi^{(1)3220}, \tag{54c}$$
$$\chi^{(1)1020} = -\chi^{(1)1323}, \tag{54d}$$
$$\chi^{(1)2030} = -\chi^{(1)2131}, \tag{54e}$$
$$\chi^{(1)3010} = -\chi^{(1)3212}, \tag{54f}$$
$$\chi^{(1)1320} = -\chi^{(1)1230}, \tag{54g}$$
$$\chi^{(1)3210} = -\chi^{(1)3120}, \tag{54h}$$
$$\chi^{(1)2130} = -\chi^{(1)2310}, \tag{54i}$$
$$\chi^{(1)1023} = -\chi^{(1)1320}, \tag{54j}$$
$$\chi^{(1)2031} = -\chi^{(1)2130}, \tag{54k}$$
$$\chi^{(1)3012} = -\chi^{(1)3210}. \tag{54l}$$

From (54g)–(54l), $\chi^{(1)ijkl}$ is completely antisymmetric under any permutation
of (0123). Among (54g)–(54i) only two are independent; among (54j)–(54l) also only
two are independent. For ${}^{(P)}\chi^{(1)ijkl}$, (54g)–(54l) give two independent conditions. For
${}^{(Sk)}\chi^{(1)ijkl}$, (54g)–(54l) give three independent conditions and ${}^{(Sk)}\chi^{(1)0123}$ must vanish.

The derivation of the formulas in this subsection from (50) to (54l) is independent
of whether $\chi^{(1)ijkl}$ is principal, axionic or skewonic. Hence, ${}^{(P)}$(54a)–${}^{(P)}$(54l) hold for
${}^{(P)}\chi^{(1)ijkl}$ with ${}^{(P)}B_{(1)} = 0$, ${}^{(A)}$(54a)–${}^{(A)}$(54l) hold for ${}^{(A)}\chi^{(1)ijkl}$ with ${}^{(A)}B_{(1)} = 0$, and
${}^{(Sk)}$(54a)–${}^{(Sk)}$(54l) hold for ${}^{(Sk)}\chi^{(1)ijkl}$ with ${}^{(Sk)}B_{(1)} = 0$. Here ${}^{(P)}$(54a)–${}^{(P)}$(54l) means
(54a)–(54l) with $\chi$ substituted by ${}^{(P)}\chi$, ${}^{(A)}$(54a)–${}^{(A)}$(54l) means (54a)–(54l) with $\chi$
substituted by ${}^{(A)}\chi$, and ${}^{(Sk)}$(54a)–${}^{(Sk)}$(54l) means (54a)–(54l) with $\chi$ substituted by
${}^{(Sk)}\chi$; similarly for ${}^{(P)}B_{(1)}$, ${}^{(A)}B_{(1)}$ and ${}^{(Sk)}B_{(1)}$. For $B_{(1)} = B_{(2)} = 0$ in all directions,
we have ${}^{(P)}B_{(1)} = {}^{(Sk)}B_{(1)} = 0$ in all directions, and hence, both ${}^{(P)}$(54a)–${}^{(P)}$(54l)
and ${}^{(Sk)}$(54a)–${}^{(Sk)}$(54l) are valid.


3.2.2. The condition of ${}^{(Sk)}B_{(1)} = {}^{(P)}B_{(1)} = 0$ and $A_{(1)} = A_{(2)}$ for all
directions of wave propagation


With the condition ${}^{(Sk)}B_{(1)} = {}^{(P)}B_{(1)} = 0$ and $A_{(1)} = A_{(2)}$ for all directions of
wave propagation, there is no birefringence for all directions of wave propagation.
From Sec. 3.2.1, Eqs. (54a)–(54l) hold because of the validity of ${}^{(Sk)}B_{(1)} =
{}^{(P)}B_{(1)} = 0$ (i.e. $B_{(1)} = 0$) for all directions of wave propagation. From $A_{(1)} = A_{(2)}$
and the definitions (42a), (42b), we have

$$\chi^{(1)1010} - (\chi^{(1)1013} + \chi^{(1)1310}) + \chi^{(1)1313} = \chi^{(1)2020} - (\chi^{(1)2023} + \chi^{(1)2320}) + \chi^{(1)2323}. \tag{55}$$

From (54c) for the principal part, the terms in the parentheses on the two sides of
the above equation cancel out and we have

$$\chi^{(1)1010} + \chi^{(1)1313} = \chi^{(1)2020} + \chi^{(1)2323}. \tag{56a}$$

Applying the active rotation $R_{\pi/2}$ in the $y$–$z$ plane to (56a), we obtain

$$\chi^{(1)1010} + \chi^{(1)1212} = \chi^{(1)3030} + \chi^{(1)3232}. \tag{56b}$$

3.3. Nonbirefringence condition for the skewonless case 


If EEP is observed, photons with different polarizations as test particles shall follow 
identical trajectories in a gravitational field. Then the photons obey WEP I and 
there is no birefringence. In this section, we will first derive the core metric formula 
for the constitutive tensor density from the nonbirefringence condition in the ske- 
wonless case (Lagrangian-based case) and then use the cosmological observations 
to constrain the spacetime constitutive tensor density to this form to ultra-high 
precision. 
From Eq. (49), the condition of nonbirefringence in the skewonless case is

$$A_{(1)} = A_{(2)}, \qquad B_{(1)} = B_{(2)} = {}^{(P)}B = 0. \tag{57}$$

With these conditions, (54a)–(54h) and (56a), (56b) are valid and give 10 conditions
on the 21 independent components of the skewonless constitutive tensor density $\chi^{ijkl}$:

$$\chi^{(1)1220} = \chi^{(1)1330}, \tag{58a}$$
$$\chi^{(1)2330} = \chi^{(1)2110}, \tag{58b}$$
$$\chi^{(1)3110} = \chi^{(1)3220}, \tag{58c}$$
$$\chi^{(1)1020} = -\chi^{(1)1323}, \tag{58d}$$
$$\chi^{(1)2030} = -\chi^{(1)2131}, \tag{58e}$$
$$\chi^{(1)3010} = -\chi^{(1)3212}, \tag{58f}$$
$$\chi^{(1)1320} = -\chi^{(1)1230}, \tag{58g}$$
$$\chi^{(1)3210} = -\chi^{(1)3120}, \tag{58h}$$
$$\chi^{(1)1010} + \chi^{(1)1313} = \chi^{(1)2020} + \chi^{(1)2323}, \tag{58i}$$
$$\chi^{(1)1010} + \chi^{(1)1212} = \chi^{(1)3030} + \chi^{(1)3232}. \tag{58j}$$

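Define

$$
\begin{aligned}
h^{(1)10} &\equiv h^{(1)01} \equiv -2\,{}^{(P)}\chi^{(1)1220}, \quad
h^{(1)20} \equiv h^{(1)02} \equiv -2\,{}^{(P)}\chi^{(1)2330}, \quad
h^{(1)30} \equiv h^{(1)03} \equiv -2\,{}^{(P)}\chi^{(1)3110}, \\
h^{(1)12} &\equiv h^{(1)21} \equiv -2\,{}^{(P)}\chi^{(1)1020}, \quad
h^{(1)23} \equiv h^{(1)32} \equiv -2\,{}^{(P)}\chi^{(1)2030}, \quad
h^{(1)31} \equiv h^{(1)13} \equiv -2\,{}^{(P)}\chi^{(1)3010}, \\
h^{(1)11} &\equiv 2\,{}^{(P)}\chi^{(1)2020} + 2\,{}^{(P)}\chi^{(1)2121} - h^{(1)00}, \\
h^{(1)22} &\equiv 2\,{}^{(P)}\chi^{(1)3030} + 2\,{}^{(P)}\chi^{(1)3232} - h^{(1)00}, \\
h^{(1)33} &\equiv 2\,{}^{(P)}\chi^{(1)1010} + 2\,{}^{(P)}\chi^{(1)1313} - h^{(1)00},
\end{aligned} \tag{59a}
$$

$$\psi \equiv 1 + 2\,{}^{(P)}\chi^{(1)1212} + \tfrac{1}{2}\eta_{00}\,(h^{(1)00} - h^{(1)11} - h^{(1)22} - h^{(1)33}) - h^{(1)11} - h^{(1)22}, \tag{59b}$$

$$\varphi \equiv \chi^{(1)0123} = \chi^{(1)[0123]}. \tag{59c}$$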

Note that in these definitions, $h^{(1)00}$ is not defined and is free. Now it is
straightforward to show that when (58a)–(58j) are satisfied, then $\chi$ can be written to first
order in terms of the fields $h^{(1)ij}$, $\psi$, and $\varphi$ with $h^{ij} \equiv \eta^{ij} + h^{(1)ij}$ and $h \equiv \det(h_{ij})$
in the following form:

$$\chi^{ijkl} = {}^{(P)}\chi^{(1)ijkl} + {}^{(A)}\chi^{(1)ijkl} = \tfrac{1}{2}(-h)^{1/2}\left[h^{ik} h^{jl} - h^{il} h^{kj}\right]\psi + \varphi\, e^{ijkl}, \tag{60}$$

with

$${}^{(P)}\chi^{(1)ijkl} \equiv \tfrac{1}{2}(-h)^{1/2}\left[h^{ik} h^{jl} - h^{il} h^{kj}\right]\psi, \tag{61a}$$

$${}^{(A)}\chi^{(1)ijkl} \equiv \varphi\, e^{ijkl}. \tag{61b}$$

We are now ready to state the following theorem.


Theorem. For linear electrodynamics with Lagrangian (17a), i.e. with skewonless
constitutive relation (12), the following three statements are equivalent to first order
in the field:

(i) $A_{(1)} = A_{(2)}$ and ${}^{(P)}B = 0$ for all directions, i.e. nonbirefringence in
electromagnetic wave propagation,
(ii) (58a)–(58j) hold,
(iii) $\chi^{ijkl}$ can be expressed as (60) with (59a)–(59c).


Proof. (i) ⇒ (ii) has been demonstrated in the derivation of (58a)–(58j).

(ii) ⇒ (iii) has also been demonstrated in the derivation of (60) above.

(iii) ⇒ (i) Equation (60) is a Lorentz tensor density equation. If it holds in
one Lorentz frame, it holds in any other frame. From this we readily check that
$A_{(1)} = A_{(2)}$ and ${}^{(P)}B = 0$ in any new frame with the wave propagation in the
z-direction.
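
The step (iii) ⇒ (i) can be illustrated numerically: build $\chi^{ijkl}$ from the metric-induced form (60) with a small random perturbation of the metric, dilaton and axion, and check that $A_{(1)} - A_{(2)}$, $B_{(1)}$ and $B_{(2)}$ vanish to first order. This is a minimal sketch, assuming NumPy and assuming the signature $\eta^{ij} = \mathrm{diag}(-1, 1, 1, 1)$, which is the choice consistent with the signs in (59a); all function names are illustrative.

```python
import itertools
import numpy as np

ETA = np.diag([-1.0, 1.0, 1.0, 1.0])  # assumed Minkowski metric eta^{ij}


def levi_civita():
    """Completely antisymmetric symbol e^{ijkl} with e^{0123} = 1."""
    eps = np.zeros((4, 4, 4, 4))
    for p in itertools.permutations(range(4)):
        inversions = sum(p[a] > p[b] for a in range(4) for b in range(a + 1, 4))
        eps[p] = -1.0 if inversions % 2 else 1.0
    return eps


def metric_induced_chi(h_up, psi, phi):
    """Constitutive tensor density of the form (60)."""
    h_down = np.linalg.inv(h_up)
    density = np.sqrt(-np.linalg.det(h_down))  # (-h)^{1/2}
    core = 0.5 * (np.einsum('ik,jl->ijkl', h_up, h_up)
                  - np.einsum('il,kj->ijkl', h_up, h_up))
    return density * core * psi + phi * levi_civita()


def coefficient(chi, a, b):
    """chi^{a0b0} - (chi^{a0b3} + chi^{a3b0}) + chi^{a3b3}, cf. Eqs. (42a)-(42d)."""
    return chi[a, 0, b, 0] - (chi[a, 0, b, 3] + chi[a, 3, b, 0]) + chi[a, 3, b, 3]


def first_order_coefficients(eps_small=1e-5, seed=0):
    """Return (A_(1), A_(2), B_(1), B_(2)) for a random small perturbation."""
    rng = np.random.default_rng(seed)
    sym = rng.normal(size=(4, 4))
    h_up = ETA + eps_small * (sym + sym.T)   # h^{ij} = eta^{ij} + h^{(1)ij}
    psi = 1.0 + eps_small * rng.normal()     # dilaton close to 1
    phi = eps_small * rng.normal()           # small axion field
    chi1 = metric_induced_chi(h_up, psi, phi) - metric_induced_chi(ETA, 1.0, 0.0)
    return tuple(coefficient(chi1, a, b) for a, b in [(1, 1), (2, 2), (1, 2), (2, 1)])
```

For a perturbation of size $10^{-5}$, $A_{(1)}$ and $A_{(2)}$ are individually of that size, while their difference and both $B$ coefficients are expected to be of second order, i.e. of order $10^{-10}$.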


This theorem is a re-statement of the results of our work [14–16]. We note that
previously we used the symbol $H^{ik}$ instead of $h^{ik}$. Because $H^{ik}$ is already used for
excitation in this paper, we changed the notation.

We constructed the relation (60) in the weak-violation approximation of EEP
in 1981 (Refs. 14–16); Haugan and Kauffmann [58] reconstructed the relation (60)
in 1995. After the cornerstone work of Lämmerzahl and Hehl [59], Favaro and
Bergamin [64] finally proved the relation (60) without assuming the weak-field
approximation (see also Ref. 65).

Polarization measurements of electromagnetic waves from pulsars, from
cosmologically distant radio sources and from GRB sources have yielded stringent
constraints agreeing with (60) down to $10^{-16}$, $10^{-32}$ and $10^{-38}$ respectively, as shown
in Table 1.

Observational constraints from pulsars (Refs. 14-16): In the 1970s and 1980s, pulsar
observations gave the best constraints on birefringence in propagation. The pulses
and micropulses from pulsars with different polarizations are correlated in general
structure and timing.®° No retardation with respect to different polarizations
is observed. This means that conditions similar to (57) are satisfied to observational
accuracy. For the Crab pulsar, the micropulses with different polarizations are
correlated in timing to within $10^{-4}$ s, and the distance of the Crab pulsar is 2200 pc;
therefore two conditions similar to (57) are satisfied to within
$10^{-4}$ s/(2200 × 3.26 light yr) ≈ $5 \times 10^{-16}$ accuracy. By 1981, over 300 pulsars in
different directions had been observed. Many of them had polarization data. Combining all of
them, (58a)-(58j) were satisfied to an accuracy of $10^{-14}$-$10^{-16}$. Since $U \sim 10^{-6}$ for the
galactic gravitational field, according to the procedure of proving the theorem,
x /U (or x/U) agrees with that given by (60) to an accuracy of $10^{-8}$-$10^{-10}$.
At that time, we anticipated that detailed analysis would reveal better results.
In 2002, a detailed analysis using X-ray pulsars®’ demonstrated the full procedure.
At that time McCulloch, Hamilton, Ables and Hunt®* had just observed a
radio pulsar in the Large Magellanic Cloud; Backer, Kulkarni, Heiles, Davis and
Goss®? had discovered a millisecond pulsar which rotates 20 times faster than the
Crab pulsar. Progress in these observations could potentially give better constraints
on some of the conditions (58a)-(58j) owing to the larger distance or faster period
involved.
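The Crab pulsar estimate above is simple arithmetic and can be checked with a short sketch. The timing accuracy, the distance and the parsec-to-light-year factor are the values quoted in the text; the year length in seconds and the function name `fractional_accuracy` are assumptions introduced here for illustration.

```python
# Fractional accuracy of the Crab pulsar timing constraint quoted in the text.
SECONDS_PER_YEAR = 3.156e7      # assumed: approximate length of a year in seconds
LIGHT_YEARS_PER_PARSEC = 3.26   # conversion factor used in the text


def fractional_accuracy(timing_s, distance_pc):
    """Ratio of the timing accuracy to the light travel time from the source."""
    travel_time_s = distance_pc * LIGHT_YEARS_PER_PARSEC * SECONDS_PER_YEAR
    return timing_s / travel_time_s


# 1e-4 s timing correlation over 2200 pc, as quoted for the Crab pulsar.
crab_accuracy = fractional_accuracy(1e-4, 2200.0)
print(f"Crab fractional accuracy ~ {crab_accuracy:.1e}")
```

The result is a few times $10^{-16}$, in line with the rounded figure given in the text.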
We also anticipated that analysis of optical and X-ray polarization data from
various astrophysical sources would give better accuracy to some of the 10
constraints in (58a)-(58j).

Thus, to high accuracy, photons are propagating in the metric field $h^{ik}$ and two
additional (pseudo)scalar fields $\psi$ and $\varphi$. A change of $h^{ik}$ to $\lambda h^{ik}$ does not affect
$\chi^{ijkl}$ in (60); this corresponds to the freedom of $h^{(1)00}$ in the definition (59a) of
$h^{(1)ij}$. Thus we have constrained the general linear constitutive tensor of 21 degrees
of freedom from the 10 constraints (58a)-(58j) to 11 degrees of freedom in (60).

Constraints from extragalactic radio-galaxy observations®!: Analyzing the data
from polarization measurements of extragalactic radio sources, Haugan and
Kauffmann*® in 1995 inferred that the resolution for null-birefringence is 0.02 cycle
at 5 GHz. This corresponds to a time resolution of $4 \times 10^{-12}$ s and gives much better
constraints. With a detailed analysis and more extragalactic radio observations, (60)
would be tested down to $10^{-28}$-$10^{-29}$ at cosmological distances. In 2002,
Kostelecký and Mewes” used polarization measurements of light from cosmologically
distant astrophysical sources to yield stringent constraints down to $2 \times 10^{-32}$. The
electromagnetic propagation in Moffat’s nonsymmetric gravitational theory’? fits
the $\chi$-$g$ framework. Krisher,’? and Haugan and Kauffmann®® have used the pulsar
data and extragalactic radio observations, respectively, to constrain it.
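The conversion from a phase resolution to a time resolution quoted above can be checked directly. This is a minimal sketch; the function name `time_resolution` is an assumption introduced here.

```python
def time_resolution(cycle_fraction, frequency_hz):
    """Time interval corresponding to a given fraction of one wave cycle."""
    return cycle_fraction / frequency_hz


# 0.02 cycle at 5 GHz, as inferred from the radio-galaxy polarization data.
resolution_s = time_resolution(0.02, 5e9)
print(f"time resolution ~ {resolution_s:.1e} s")
```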

Constraints from gamma ray burst observations (Ref. 17): Recent polarization
observations on GRBs give even better constraints on the dispersion relation and
nonbirefringence in cosmic propagation (Refs. 74, 75). The observation on the polarized GRB
061122 (z = 1.33) gives a lower limit on its polarization fraction of 60% at 68%
confidence level (c.l.) and 33% at 90% c.l. in the 250-800 keV energy range (Ref. 74). The
observation on the polarized GRB 140206A constrains the linear polarization level
of the second peak of this GRB above 28% at 90% c.l. in the 200-400 keV energy
range (Ref. 75); the redshift of the source is measured from the GRB afterglow optical
spectroscopy to be z = 2.739. GRB polarization observations have been used to
set constraints on various dispersion relations (see, e.g. Refs. 76 and 77 and
references therein). These two new GRB observations have larger and better redshift
determinations than previous observations. We use them to give better constraints
in our case. Since birefringence is proportional to the wave vector k in our case,
as a gamma ray of a particular frequency (energy) travels in the cosmic spacetime,
the two linear polarization eigenmodes pick up small phase differences. A
linear polarization mode from a distant source, resolved into these two modes,
becomes elliptically polarized during travel and loses part of its linear coherence. The
way a gamma ray loses linear coherence depends on the frequency span. For a
band of frequencies, the extent of coherence loss depends on the distance of travel.
The depolarization distance is of the order of the frequency band span $\pi\Delta f$ times the
integral $I = \int (1 + z(t))\,dt$ of the redshift factor $(1 + z(t))$ with respect to the time
of travel. For GRB 140206A, this is about


$\pi\Delta f\, I = \pi\Delta f \int (1 + z(t))\,dt \approx 1.5 \times 10^{20}\ \mathrm{Hz} \times 0.6 \times 10^{18}\ \mathrm{s} \approx 10^{38}$. (62)
Since we do observe linear polarization in the 200-400 keV energy band of GRB
140206A with a lower bound of 28%, this gives a fractional constraint of about $10^{-38}$
on a combination of $\chi$'s. A similar constraint can be obtained for GRB 061122
(the band width times the redshift is about the same). A more detailed modeling
may give better limits. The distribution of GRBs is basically isotropic. When this
procedure is applied to an ensemble of polarized GRBs from various directions, the
relation (60) would be verified to about $10^{-38}$.
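The order-of-magnitude estimate for GRB 140206A can be reproduced with a few lines. This is a sketch under stated assumptions: the eV-to-Hz factor (1 eV/h, about 2.418 × 10^14 Hz) and the function name `depolarization_phase` are introduced here, while the 200-400 keV band and the value 0.6 × 10^18 s for the redshift-weighted travel time are taken from the text.

```python
import math

EV_TO_HZ = 2.418e14  # assumed: frequency of a 1 eV photon, E/h, in Hz


def depolarization_phase(e_low_kev, e_high_kev, redshift_integral_s):
    """pi * (band width in Hz) * integral of (1 + z) dt over the travel time."""
    delta_f_hz = (e_high_kev - e_low_kev) * 1e3 * EV_TO_HZ
    return math.pi * delta_f_hz * redshift_integral_s


# 200-400 keV band, with I = 0.6e18 s as quoted in the text.
phase = depolarization_phase(200.0, 400.0, 0.6e18)
fractional_constraint = 1.0 / phase
print(f"pi*df*I ~ {phase:.1e}, fractional constraint ~ {fractional_constraint:.1e}")
```

The product comes out close to $10^{38}$, so its inverse gives the quoted fractional constraint of order $10^{-38}$.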

Thus, we see that from the pulsar signal propagation, the polarization observations
on radio galaxies and the GRB observations, the nonbirefringence condition
is verified empirically in spacetime propagation with accuracies to $10^{-16}$, $10^{-32}$
and $10^{-38}$. The accuracies of the three observational constraints are summarized in
Table 1. The constitutive tensor can be constructed by the procedure in the proof
of the theorem in this subsection to be in the core form (60) with accuracy to $10^{-38}$.
Nonbirefringence (no splitting, no retardation) for electromagnetic wave propagation
independent of polarization and frequency (energy) is the statement of the Galileo
Equivalence Principle for photons, or WEP I for photons. Hence WEP I for photons
is verified to this accuracy in the spacetime propagation.

In the following subsection, we assume (60) (i.e. (28)) is valid and look into the 
influence of the axion field and the dilaton field of the constitutive tensor on the 
dispersion relation. 


3.4. Wave propagation and the dispersion relation in dilaton field 
and axion field 


We first notice that in the lowest eikonal approximation, the dispersion relation 
(46) or (47) does not contain the axion piece and does not contain the gradient of 
fields. Dilaton in (60) goes in this dispersion relation only as an overall scale factor 
and drops out too. 

To derive the influence of the dilaton field and the Abelian axion field on the 
dispersion relation, one needs to keep the second term in Eq. (31). This has been 
done for the axion field in Refs. 53, 60, 61, 78-80. Here we follow the treatment in 
Ref. 63 to develop it for the joint dilaton field and axion field with the constitutive
relation (60). Near the origin in a local inertial frame, the constitutive tensor density
in the dilaton field $\psi$ and axion field $\varphi$ (Eq. (60)) becomes


$\chi^{ijkl}(x^m) = [\tfrac{1}{2}\eta^{ik}\eta^{jl} - \tfrac{1}{2}\eta^{il}\eta^{kj}]\,\psi(x^m) + \varphi(x^m)\,e^{ijkl} + O(\delta_{ij}x^i x^j)$, (63)


where $\eta^{ij}$ is the Minkowski metric with signature −2 and $\delta_{ij}$ the Kronecker delta.
In the local inertial frame, we use the Minkowski metric and its inverse to raise and
lower indices. Substituting (63) into Eq. (31) and multiplying by 2, we have


$\psi A^{i,j}{}_{,j} - \psi A^{j,i}{}_{,j} + \psi_{,j}A^{i,j} - \psi_{,j}A^{j,i} + 2\varphi_{,j}e^{ijkl}A_{k,l} = 0$. (64)


We notice that (64) is both Lorentz covariant and gauge invariant. 
We expand the dilaton field $\psi(x^m)$ and the axion field $\varphi(x^m)$ at the 4-point
(event) P with respect to the event (time and position) $P_0$ at the origin as follows:


$\psi(x^m) = \psi(P_0) + \psi_{,i}(P_0)x^i + O(\delta_{ij}x^i x^j)$, (65a)
$\varphi(x^m) = \varphi(P_0) + \varphi_{,i}(P_0)x^i + O(\delta_{ij}x^i x^j)$. (65b)


To look for wave solutions, we use the eikonal approximation, which does not neglect
field gradient/medium inhomogeneity. Choose the z-axis in the wave propagation
direction so that the solution takes the following form:


$A \equiv (A_0, A_1, A_2, A_3) = (\bar{A}_0, \bar{A}_1, \bar{A}_2, \bar{A}_3)\,e^{ikz - i\omega t} = \bar{A}_i\,e^{ikz - i\omega t}$. (66)

Expand the solution as

$A_i = A^{(0)}_i + A^{(1)}_i + O(2) = [\bar{A}^{(0)}_i + \bar{A}^{(1)}_i + O(2)]\,e^{ikz - i\omega t}$. (67)


Now use the eikonal approximation to obtain a local dispersion relation. In the
eikonal approximation, we only keep terms linear in the derivative of the dilaton
field and the axion field; we neglect terms containing the second-order derivatives
of the dilaton field or the axion field, terms of $O(\delta_{ij}x^i x^j)$ and terms of mixed
second-order, e.g. terms of $O(A^{(1)}_i\psi_{,j})$ or $O(A^{(1)}_i\varphi_{,j})$; we call all these terms O(2).

Imposing the radiation gauge condition in the zeroth-order in the weak field/dilute
medium approximation, we find that to zeroth-order, (64) is

$\psi A^{(0)i,j}{}_{,j} = 0$ or $A^{(0)i,j}{}_{,j} = 0$, (68)

and the corresponding zeroth-order solution and the dispersion relation are

$A^{(0)} = (0, A^{(0)}_1, A^{(0)}_2, 0) = (0, \bar{A}^{(0)}_1, \bar{A}^{(0)}_2, 0)\,e^{ikz - i\omega t}$, (69a)
$\omega = k + O(1)$. (69b)

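Substituting (68) and (69a), (69b) into Eq. (64), we have

$\psi A^{(0)i,j}{}_{,j} + \psi A^{(1)i,j}{}_{,j} - \psi A^{(1)j,i}{}_{,j} + \psi_{,j}A^{(0)i,j} - \psi_{,j}A^{(0)j,i} + 2\varphi_{,j}e^{ijkl}A^{(0)}_{k,l} = 0 + O(2)$. (70)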

The i = 0 and i = 3 components of (70) both lead to the same modified Lorentz
gauge condition in the dilaton field and the axion field in the O(1) order®?:

AM? , = Wha — 29,2)AO — wl (Wa + 29a)AY? +.0(2). (71) 
Since Eq. (71) does not contain $\omega$ and $k$, it does not contribute to the determination
of the dispersion relation.


Using the gauge condition (71), we obtain the i = 1 and i = 2 components of
Eq. (70) as


(w? — k2) A — kA DA (Wo + Y3) — kA YW" (y0 + y,3) = 0 + O(2), 


(w? — k2) AM — tkAQ DH (ho + V3) + KAO WY (Y0 + 9,3) = 0+ O(2). 
These two equations determine the dispersion relation in the dilaton field and
the axion field:


sit (w? — k?) — ik 1 (tho + 3) —2iky-"(v 0+ 9,3) 
2ikb "(yo + v3) (w? — k?) — ikw* (0 + W3) 
= [(w? — k?) —ikb "(ho + va)? — 4k? 7(90 + 9,3)? = 0 + O(2). 


(73) 


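Its solutions are


$\omega = k - \tfrac{i}{2}\psi^{-1}(\psi_{,0} + \psi_{,3}) \pm \psi^{-1}(\varphi_{,0} + \varphi_{,3}) + O(2)$, or (74a)


$k = \omega + \tfrac{i}{2}\psi^{-1}(\psi_{,0} + \psi_{,3}) \mp \psi^{-1}(\varphi_{,0} + \varphi_{,3}) + O(2)$, (74b)


with the group velocity $v_g = \partial\omega/\partial k = 1$ independent of polarization. When the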
dispersion relation is satisfied, (72a) and (72b) have two independent solutions for 


the polarization eigenvectors A”) - (A, A, A, A) with 


AM [ik * (Yo + ¥,3)] _ Bikbwotva)l _ 4, — 
AO, [(W? —F) thd "o+da)] [42k o+ys)] 
AS = AS = 0, (75b) 


for $\omega = k - (i/2)\psi^{-1}(\psi_{,0} + \psi_{,3}) \pm \psi^{-1}(\varphi_{,0} + \varphi_{,3}) + O(2)$, respectively. From (75a), the
two polarization eigenstates are the left circularly polarized state and the right circularly
polarized state in a varying axion field. This agrees with and generalizes the
electromagnetic wave propagation in the axion field as derived earlier (Refs. 53, 60, 61, 78-80).

With the dispersion relation (74), the plane-wave solution (66) propagating in the
z-direction is


$A \equiv (A_0, A_1, A_2, A_3) = (0, \bar{A}^{(0)}_1, \bar{A}^{(0)}_2, 0)\,e^{ikz - i\omega t}$
$\quad = (0, \bar{A}^{(0)}_1, \bar{A}^{(0)}_2, 0)\,\exp[ikz - ikt \pm (-i)\psi^{-1}(\varphi_{,0}t + \varphi_{,3}z) - \tfrac{1}{2}\psi^{-1}(\psi_{,0}t + \psi_{,3}z)]$, (76)


with $\bar{A}^{(0)}_1 = \pm i\bar{A}^{(0)}_2$. The additional factor acquired in the propagation is
$\exp[\pm(-i)\psi^{-1}(\varphi_{,0}t + \varphi_{,3}z)] \times \exp[-(1/2)\psi^{-1}(\psi_{,0}t + \psi_{,3}z)]$. The first part of this factor,
i.e. the axion factor $\exp[\pm(-i)\psi^{-1}(\varphi_{,0}t + \varphi_{,3}z)]$, adds a phase in the propagation.
The second part of this factor, i.e. the dilaton factor $\exp[-(1/2)\psi^{-1}(\psi_{,0}t + \psi_{,3}z)]$,
amplifies or attenuates the wave according to whether $(\psi_{,0}t + \psi_{,3}z)$ is less than
zero or greater than zero. For the right circularly polarized electromagnetic wave,
the effect of the axion field in the propagation from a point
$P_1 = \{x^i_{(1)}\} = \{x^0_{(1)}; x^\mu_{(1)}\} = \{x^0_{(1)}, x^1_{(1)}, x^2_{(1)}, x^3_{(1)}\}$ to another point
$P_2 = \{x^i_{(2)}\} = \{x^0_{(2)}; x^\mu_{(2)}\} = \{x^0_{(2)}, x^1_{(2)}, x^2_{(2)}, x^3_{(2)}\}$
is to add a phase of $\alpha = \psi^{-1}[\varphi(P_2) - \varphi(P_1)]$ ($\approx \varphi(P_2) - \varphi(P_1)$
for $\psi \approx 1$) to the wave; for left circularly polarized light, the effect is to add an
opposite phase (Refs. 53, 60, 61, 78-80). A linearly polarized electromagnetic wave is a
superposition of circularly polarized waves. Its polarization vector will then rotate by an angle
$\alpha$. The effect of the dilaton field is to amplify the wave by a factor
$\exp[-(1/2)\psi^{-1}(\psi_{,0}t + \psi_{,3}z)] = \exp[-(1/2)((\ln\psi)_{,0}t + (\ln\psi)_{,3}z)] = (\psi(P_1)/\psi(P_2))^{1/2}$.
Whether the dilaton field amplifies or attenuates the propagating wave depends on whether
$\psi(P_1)/\psi(P_2) > 1$ or $\psi(P_1)/\psi(P_2) < 1$, respectively.

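For a plane wave propagating in the direction $n^\mu = (n^1, n^2, n^3)$ with
$(n^1)^2 + (n^2)^2 + (n^3)^2 = 1$, the solution is


$A(n^\mu) \equiv (A_0, A_1, A_2, A_3) = (0, \bar{A}_1, \bar{A}_2, \bar{A}_3)\exp(-ikn^\mu x_\mu - i\omega t)$
$\quad = (0, \bar{A}_1, \bar{A}_2, \bar{A}_3)\exp[-ikn^\mu x_\mu - ikt \pm (-i)\psi^{-1}(\varphi_{,0}t - n^\mu\varphi_{,\mu}n_\nu x^\nu) - \tfrac{1}{2}\psi^{-1}(\psi_{,0}t - n^\mu\psi_{,\mu}n_\nu x^\nu)]$, (77)

where $\bar{A}_\mu \equiv \bar{A}^{(0)}_\mu + n_\mu n^\nu\bar{A}^{(0)}_\nu$ with $\bar{A}^{(0)}_1 = \pm i\bar{A}^{(0)}_2$ and $\bar{A}^{(0)}_3 = 0$
[$n_\mu \equiv (-n^1, -n^2, -n^3)$]. There is polarization rotation for linearly polarized light due to
the axion field gradient, and amplification/attenuation due to the dilaton field gradient.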

The above analysis is local. In the global situation, choose local inertial frames
along the wave trajectory and integrate along the trajectory. Since $\psi$ is a scalar,
the integration gives $(\psi(P_1)/\psi(P_2))^{1/2}$ as the amplification factor for the
propagation in the dilaton field. For small dilaton field variations, the
amplification/attenuation factor is equal to $[1 - (1/2)(\Delta\psi/\psi)]$ to a very good approximation,
with $\Delta\psi \equiv \psi(P_2) - \psi(P_1)$. Since this factor does not depend on the wave
number/frequency and polarization, it will not distort the source spectrum in
propagation, but gives an overall amplification/attenuation factor to the spectrum. The
axion field contributes to the phase factor and induces polarization rotation as in
previous investigations (Refs. 53, 60, 61, 78-80). For $\psi \approx 1$ (constant), the induced polarization
rotation agrees with previous results which were obtained without considering the
dilaton effect. If the dilaton field varies significantly, a $\psi$-weight needs to be included
in the integration.
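The statement that the dilaton factor reduces to a linear expression for small field variations can be illustrated numerically. This is a minimal sketch: the function names and the sample field values are assumptions introduced here, and the linear form is normalized by the value of the dilaton field at the starting point, a choice the text leaves unspecified.

```python
import math


def amplification_factor(psi_start, psi_end):
    """Exact amplitude factor (psi(P1)/psi(P2))**(1/2) from the text."""
    return math.sqrt(psi_start / psi_end)


def amplification_factor_linear(psi_start, psi_end):
    """Small-variation form 1 - (1/2) * (delta psi / psi).

    Assumption: delta psi = psi(P2) - psi(P1) is normalized by psi(P1).
    """
    return 1.0 - 0.5 * (psi_end - psi_start) / psi_start


# Hypothetical field values with a 0.1% increase along the ray.
exact = amplification_factor(1.0, 1.001)
approx = amplification_factor_linear(1.0, 1.001)
print(f"exact = {exact:.8f}, linear = {approx:.8f}")
```

An increase of the dilaton field along the ray gives a factor below one (attenuation), and the two expressions differ only at second order in the fractional change.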

The complete agreement with EEP for the photon sector requires, in addition
to the Galileo equivalence principle (WEP I; nonbirefringence) for photons: (i) no
polarization rotation (WEP II); (ii) no amplification/no attenuation in spacetime
propagation; (iii) no spectral distortion. With nonbirefringence, any skewonless
spacetime constitutive tensor must be of the form (60), hence there is no spectral
distortion. From (60), (i) and (ii) imply that the dilaton $\psi$ and axion $\varphi$ must be constant,
i.e. no varying dilaton field and no varying axion field; the EEP for the photon sector
is observed; the spacetime constitutive tensor is of metric-induced form. Thus the
three observational conditions are tied to EEP and to the metric-induced spacetime
constitutive tensor in the photon sector.

In the next subsection, we look into the empirical support of no amplification/no 
attenuation and no polarization rotation conditions. 
3.5. No amplification/no attenuation and no polarization rotation
constraints on cosmic dilaton field and cosmic axion field


In this section, we look into the observations/experiments to constrain the dilaton 
field contribution and the axion field contribution to spacetime constitutive tensor 
density. 

No amplification/no attenuation constraint on the cosmic field: From Eqs. (76)
and (77) in the last section, we have derived that the amplitude and phase factor
of propagation in the cosmic dilaton and cosmic Abelian axion field is changed by
$(\psi(P_1)/\psi(P_2))^{1/2} \times \exp[ikz - ikt \pm (-i)(\varphi(P_2) - \varphi(P_1))]$. The effect of the dilaton field
is to give amplification ($\psi(P_1) - \psi(P_2) > 0$) or attenuation ($\psi(P_1) - \psi(P_2) < 0$) to
the amplitude of the wave independent of frequency and polarization.

The spectrum of the CMB is well understood to be a Planck blackbody spectrum.
In the cosmic propagation, this spectrum would be amplified or attenuated by
the factor $(\psi(P_1)/\psi(P_2))^{1/2}$. However, the CMB spectrum is measured to agree
with the ideal Planck spectrum at temperature 2.7255 ± 0.0006 K (Ref. 81) with a
fractional accuracy of $2 \times 10^{-4}$. The spectrum is also redshifted due to cosmological
curvature (or expansion), but this does not change the blackbody character. The
measured shape of the CMB spectrum does not deviate from the Planck spectrum within
its experimental accuracy. In the dilaton field the relative increase in power is
proportional to the amplitude increase squared, i.e. $\psi(P_1)/\psi(P_2)$. Since the total
power of the blackbody radiation is proportional to the fourth power of the
temperature, $T^4$, the fractional change of the dilaton field since the last scattering surface
of the CMB must be less than about $8 \times 10^{-4}$ and we have


$|\Delta\psi|/\psi \le 4(0.0006/2.7255) \approx 8 \times 10^{-4}$. (78)


Direct fitting to the CMB data with the addition of the scale factor $\psi(P_1)/\psi(P_2)$
would give a more accurate value.
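The bound (78) follows from propagating the temperature uncertainty through the $T^4$ dependence of the total blackbody power. The sketch below shows the arithmetic; the function name `dilaton_fractional_bound` is an assumption introduced here, and the temperature and its uncertainty are the values quoted in the text.

```python
def dilaton_fractional_bound(temperature_k, sigma_temperature_k):
    """Fractional power uncertainty for P proportional to T**4: dP/P = 4 dT/T.

    Since the dilaton changes the power by the factor psi(P1)/psi(P2),
    this is also the bound on the fractional dilaton change.
    """
    return 4.0 * sigma_temperature_k / temperature_k


bound = dilaton_fractional_bound(2.7255, 0.0006)
print(f"|delta psi|/psi < {bound:.1e}")
```

The value is slightly below $9 \times 10^{-4}$, of the same order as the rounded figure quoted in (78).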

Constraints on the cosmic polarization rotation and the cosmic axion field: From
(77), for the right circularly polarized electromagnetic wave, the propagation from
a point $P_1$ (4-point) to another point $P_2$ adds a phase of $\alpha = \varphi(P_2) - \varphi(P_1)$ to the
wave; for left circularly polarized light, the added phase will be opposite in sign.>*
A linearly polarized electromagnetic wave is a superposition of circularly polarized
waves. Its polarization vector will then rotate by an angle $\alpha$. In the global situation,
it is a property of the (pseudo)scalar field that when we integrate along the light (wave)
trajectory the total polarization rotation (relative to no $\varphi$-interaction) is again
$\alpha = \Delta\varphi = \varphi(P_2) - \varphi(P_1)$, where $\varphi(P_1)$ and $\varphi(P_2)$ are the values of the scalar field
at the beginning and end of the wave. The constraints listed on the axion field in
Table 1 are from the UV polarization observations of radio galaxies and the CMB
polarization observations: 0.02 for the Cosmic Polarization Rotation (CPR) mean
value $|\langle\alpha\rangle|$ and 0.03 for the CPR fluctuations $\langle(\alpha - \langle\alpha\rangle)^2\rangle^{1/2}$ (Refs. 82-84).
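The claim that opposite phases on the two circular components rotate a linear polarization by the angle α can be checked with two-component (Jones-type) vectors. This is an illustrative sketch: the basis vectors, the assignment of which circular state receives which sign, and the function name are assumptions introduced here, not the chapter's conventions.

```python
import cmath
import math


def rotate_linear_polarization(alpha):
    """Apply phases exp(-i*alpha) and exp(+i*alpha) to the two circular
    components of an x-polarized wave and return the transverse field (Ex, Ey)."""
    norm = 1.0 / math.sqrt(2.0)
    e_plus = (norm, 1j * norm)     # one circular basis state (labeling assumed)
    e_minus = (norm, -1j * norm)   # the opposite circular basis state
    # x-polarization is (e_plus + e_minus)/sqrt(2); attach opposite phases.
    c_plus = norm * cmath.exp(-1j * alpha)
    c_minus = norm * cmath.exp(1j * alpha)
    ex = c_plus * e_plus[0] + c_minus * e_minus[0]
    ey = c_plus * e_plus[1] + c_minus * e_minus[1]
    return ex, ey


alpha = 0.02  # sample rotation angle, numerically equal to the CPR bound quoted above
ex, ey = rotate_linear_polarization(alpha)
print(ex, ey)
```

The output field is real and equal to (cos α, sin α), i.e. the original linear polarization rotated by α; swapping the signs of the two phases rotates it the other way.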

Additional constraints to have the unique physical metric: From (78) the
fractional change of dilaton $|\Delta\psi|/\psi$ is less than about $8 \times 10^{-4}$ since the time of the
last scattering surface of the CMB. Eötvös-type experiments constrain the
fractional variation of dilaton to ~$10^{-10}U$, where U is the dimensionless Newtonian
potential in the experimental environment. The Vessot-Levine redshift experiment and
Hughes-Drever-type experiments give further constraints.®! All these constraints
are summarized in Table 1. This leads to a unique physical metric to high precision
for all degrees of freedom except the axion degree of freedom and the cosmic dilaton
degree of freedom, which are only mildly constrained.


3.6. Spacetime constitutive relation including skewons (Refs. 62, 17)


In this subsection, we review the present status of empirical tests of the full local linear
spacetime constitutive tensor density (26) of premetric electrodynamics. Since EEP
is verified to a good precision, we are mainly concerned with weak EEP violations
and weak additional fields, i.e. we are assuming $\chi^{(0)ijkl}$ is metric and the components
of $\chi^{(1)ijkl}$ are small in most parts of our treatment. We note that all the formulas
in Sec. 3.2 are valid with or without the skewonless assumption.

In particular, the condition of (SI)B i) = ®)Bay =0 and Ai) =A for all direc- 
tions of wave propagation still gives (54a)—(541) without skewonless assumption. 

We do not assume skewonless condition in this subsection. The Hehl-Obukhov— 
Rubilar skewon field (27c) can be represented as 

(Sk), gk = guinks | — gong e. (79) 

n 


is a traceless tensor with S,,,"" = 0.° From (79), we have 
1 1 1 1 
(Sk), (1)1320 _ seh na = gf )2. (Sk), (1)1230 = gf ne rs gf 1 


where S,,, 


(Sk), (1)2310 _ styo ie got, (80) 
From (S*)(54g)—(S4)(541), we must have (!y(1)1320 — (Sk)y(1)1230 _ (Sk), (1)2810 
= 0. From (80) and TrS,, = 0, then all gs), gt, g),2 and gs must 
vanish. 
From (79) together with 8" (54a)—")(54f), we have 
1 1 1 1 1 1 
gM2 ss, gM sa sOt, gOt— 9a, (81) 
1 1 1 1 1 1 
gf is = gf 3. gf ya = gf i, gf 49 a2 gf io 
Using the Lorentz metric (h-metric in the local inertial frame) to raise/lower
the indices, we have


g(t)mn = agian. sO). = i (82) 


Thus, when ${}^{(Sk)}$(54a)-${}^{(Sk)}$(54l) (nine independent conditions) are satisfied, the
skewon degrees of freedom are reduced to 6 (15 − 9) and only the Type II skewon field
remains.

Under Lorentz (coordinate) transformation, the symmetric part and the antisymmetric
part of $S_{mn}$ transform separately. Hence, with the condition ${}^{(Sk)}B = 0$
for all directions of wave propagation, the skewon field is constrained to Type II.
The reverse is also true: since ${}^{(SkII)}S_{mn}$ is a tensor, when it satisfies ${}^{(Sk)}B = 0$ for
wave propagation along the z-axis, it satisfies ${}^{(Sk)}B = 0$ for all directions of wave
propagation. Hence we have the lemma:


Lemma. The following three statements are equivalent: 


(i) ${}^{(Sk)}B = 0$ for all directions,
(ii) ${}^{(Sk)}$(54a)-${}^{(Sk)}$(54l) hold,
(iii) (Sk) gan as defined by (79) can be written as (Sk) gin = SKIDS. with 
(SKI) g = (SKID) g 


Proof. (i) $\Rightarrow$ (ii) has been demonstrated in the derivation of ${}^{(Sk)}$(54a)-${}^{(Sk)}$(54l).

(ii) $\Leftrightarrow$ (iii) has also been demonstrated in the derivation of (80)-(82) and its
reversibility.

(iii) $\Rightarrow$ (i): ${}^{(SkII)}S_{ij}$ is a tensor. If its antisymmetric property holds in one
frame, it holds in any frame. Hence, in any new frame with the propagation in
the z-direction, ${}^{(Sk)}$(54a)-${}^{(Sk)}$(54l) hold and we have ${}^{(Sk)}B = 0$ for propagation
in the z-direction. Since the z-direction can be arbitrary, we have ${}^{(Sk)}B = 0$ for all
directions.


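The condition of $B_{(1)} = B_{(2)} = 0$ and $A_{(1)} = A_{(2)}$ for all directions of wave
propagation gives (56a), (56b). Define the antisymmetric metric $p^{ij}$ as follows:


$p^{10} \equiv -p^{01} \equiv 2\,{}^{(SkII)}\chi^{(1)1220}$; $p^{20} \equiv -p^{02} \equiv 2\,{}^{(SkII)}\chi^{(1)2330}$;
$p^{30} \equiv -p^{03} \equiv 2\,{}^{(SkII)}\chi^{(1)3110}$; $p^{12} \equiv -p^{21} \equiv 2\,{}^{(SkII)}\chi^{(1)1020}$;
$p^{23} \equiv -p^{32} \equiv 2\,{}^{(SkII)}\chi^{(1)2030}$; $p^{31} \equiv -p^{13} \equiv 2\,{}^{(SkII)}\chi^{(1)3010}$;
$p^{00} \equiv p^{11} \equiv p^{22} \equiv p^{33} \equiv 0$. (83)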


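It is straightforward to show now that when (54a)-(54l) and (56a)-(56b) are
satisfied, then $\chi$ can be written to first-order in terms of the fields $h^{(1)ij}$, $\psi$, $\varphi$, and
$p^{ij}$ with $h^{ij} \equiv \eta^{ij} + h^{(1)ij}$ and $h \equiv \det(h_{ij})$ in the following form:


$\chi^{ijkl} = {}^{(P)}\chi^{(1)ijkl} + {}^{(A)}\chi^{(1)ijkl} + {}^{(SkII)}\chi^{(1)ijkl}$
$\quad = \tfrac{1}{2}(-h)^{1/2}[h^{ik}h^{jl} - h^{il}h^{kj}]\psi + \varphi e^{ijkl} + \tfrac{1}{2}(-\eta)^{1/2}(p^{ik}\eta^{jl} - p^{il}\eta^{jk} + \eta^{ik}p^{jl} - \eta^{il}p^{jk})$, (84)

with

${}^{(P)}\chi^{(1)ijkl} \equiv \tfrac{1}{2}(-h)^{1/2}[h^{ik}h^{jl} - h^{il}h^{kj}]\psi$, (85a)
${}^{(A)}\chi^{(1)ijkl} \equiv \varphi e^{ijkl}$, (85b)
${}^{(SkII)}\chi^{(1)ijkl} \equiv \tfrac{1}{2}(-\eta)^{1/2}(p^{ik}\eta^{jl} - p^{il}\eta^{jk} + \eta^{ik}p^{jl} - \eta^{il}p^{jk})$. (85c)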

We can now readily derive the following theorem.
Theorem. For linear electrodynamics with skewonful constitutive relation (26) with
${}^{(Sk)}B = 0$ satisfied for all directions, the following three statements are equivalent
to first-order in the field:


(i) $A_{(1)} = A_{(2)}$ and ${}^{(P)}B = 0$ for all directions, i.e. nonbirefringence in
electromagnetic wave propagation,
(ii) (58a)-(58j) hold,
(iii) $\chi^{ijkl}$ can be expressed as (84) with (85a)-(85c).


The proof is similar to that for theorem in Sec. 3.3°; readers could readily figure 
it out. 

When the principal part ${}^{(P)}\chi^{ijkl}$ of the constitutive tensor is induced by the metric
$h^{ij}$ and dilaton $\psi$, i.e.

${}^{(P)}\chi^{ijkl} = (-h)^{1/2}[\tfrac{1}{2}h^{ik}h^{jl} - \tfrac{1}{2}h^{il}h^{kj}]\psi$, (86)
it is easy to check by substitution that 
Aa) =A) and By = B=0, (87) 


We have $\xi = -4({}^{(Sk)}B)^2$. The three cases discussed after Eq. (49) reduce to two
cases:


(a) $\xi = 0$, ${}^{(Sk)}B = 0$: There are no birefringence and no dissipation/amplification
in wave propagation;

(b) $\xi < 0$, ${}^{(Sk)}B \neq 0$: There is no birefringence, but there are both dissipative and
amplifying modes in wave propagation.


Now the issue is: When the skewon part of the constitutive tensor is nonzero, what 
can we say about the spacetime structure empirically? 

If $\xi$ is less than zero, i.e. $(A_{(1)} - A_{(2)})^2 + 4({}^{(P)}B)^2 < 4({}^{(Sk)}B)^2$, the dispersion
relation (47) is


$\omega = k\,[1 + \tfrac{1}{2}(A_{(1)} + A_{(2)}) \pm \tfrac{1}{2}(-\xi)^{1/2}\,i] + O(2)$. (88)


The exponential factor in the wave solution (36) is of the form


$\exp(ikz - i\omega t) \sim \exp[ikz - ik(1 + \tfrac{1}{2}(A_{(1)} + A_{(2)}))t]\,\exp(\pm\tfrac{1}{2}(-\xi)^{1/2}kt)$. (89)


There are both dissipative and amplifying wave propagation modes. In the
small $\xi$ limit, the amplification/attenuation factor $\exp(\pm(1/2)(-\xi)^{1/2}kt)$ equals
$[1 \pm (1/2)(-\xi)^{1/2}kt]$ to a very good approximation. Since this factor depends on
the wave number/frequency, it will distort the source spectrum in propagation.

The spectrum of the CMB is well understood to be a Planck blackbody spectrum.
It is measured to agree with the ideal Planck spectrum at temperature
2.7255 ± 0.0006 K (Ref. 81). The measured shape of the CMB spectrum does not deviate
from the Planck spectrum within its experimental accuracy. The overall shape, fitted to
a Planck spectrum times a linear factor $[1 \pm (1/2)(-\xi)^{1/2}kt]$, agrees
with the Planck spectrum to better than $10^{-4}$. Planck Surveyor has nine bands of detection from
30 to 857 GHz.®° For weak propagation deviation, the amplitude of the wave is
increased or decreased linearly as $(1/2)(-\xi)^{1/2}kt$ depending on frequency. For cosmic
propagation, the CMB amplitude change due to redshift (or blueshift) is universal.
The frequency (wave number) change is proportional to $(1 + z(t))$ with $z(t)$ the redshift
factor at time t of propagation. We need to replace $kt$ in the $[1 \pm (1/2)(-\xi)^{1/2}kt]$
factor by the integral


$\int k(t)\,dt = \int k(t_0)(1 + z(t))\,dt = (1 + \langle z(t)\rangle)\,k(t_0)\,(t_0 - t_1)$, (90)


with $\langle z(t)\rangle$ the average of $z(t)$ during propagation defined by the last equality of
(90), $t_0$ the present time (the age of our universe) and $t_1$ the time at the photon
decoupling epoch. According to Planck 2013 results,®° the age of our universe $t_0$ is
13.8 Gyr, the decoupling time $t_1$ is 0.00038 Gyr, hence $(t_0 - t_1)$ is ≈ 13.8 Gyr, and
$z(t_1)$ is 1090. Using the Planck ΛCDM concordance model, the factor $(1 + \langle z(t)\rangle)$ is
estimated to be about 3 and the value $(1 + \langle z(t)\rangle)(t_0 - t_1)$ is more than 40 Gyr.
The factor $(1 + \langle z(t)\rangle)$ multiplied by $(t_0 - t_1)$ is the angular diameter distance $D_A$
at which we are observing the CMB and is equal to the comoving size of the sound
horizon at the time of last scattering, $r_s(z(t_1))$, divided by the observed angular
size $\theta_* = r_s/D_A$ from seven acoustic peaks in the CMB anisotropy spectrum. From
Planck results, $r_s$ = 144.75 ± 0.66 Mpc and $\theta_*$ = (1.04148 ± 0.00066) × $10^{-2}$. Hence,
we have $D_A = r_s/\theta_*$ = 13898 ± 64 Mpc = 45.328 ± 0.21 Gyr. This is consistent with
our integral estimation.
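The distance quoted above follows from dividing the sound horizon by the angular size, with the two uncertainties combined in quadrature. The sketch below reproduces the numbers; the Mpc-to-light-Gyr conversion factor, the quadrature error propagation and the function name are assumptions introduced here for illustration.

```python
import math

MPC_TO_LIGHT_GYR = 3.2616e-3  # assumed: 1 Mpc is about 3.2616e6 light years


def distance_from_sound_horizon(r_s_mpc, sigma_r_mpc, theta, sigma_theta):
    """Return D = r_s / theta and its uncertainty (relative errors in quadrature)."""
    distance = r_s_mpc / theta
    relative_error = math.hypot(sigma_r_mpc / r_s_mpc, sigma_theta / theta)
    return distance, distance * relative_error


# Planck values quoted in the text.
d_mpc, err_mpc = distance_from_sound_horizon(144.75, 0.66, 1.04148e-2, 0.00066e-2)
d_gyr = d_mpc * MPC_TO_LIGHT_GYR
err_gyr = err_mpc * MPC_TO_LIGHT_GYR
print(f"D = {d_mpc:.0f} +/- {err_mpc:.0f} Mpc = {d_gyr:.2f} +/- {err_gyr:.2f} Gyr")
```

With these inputs the distance and its uncertainty agree with the values stated in the text to the quoted precision.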

For the highest frequency band, $\omega$ is $2\pi \times 857$ GHz. The fractional
amplification/dissipation is


$\tfrac{1}{2}(-\xi)^{1/2}\,\omega \times 45.328\ \mathrm{Gyr} = 3.8 \times 10^{30}\,(-\xi)^{1/2}$. (91)


For the lowest frequency band, $\omega$ is $2\pi \times 30$ GHz; the effect is about ±3.5% of
(91). From CMB observations that the spectrum deviates by less than $10^{-4}$, we
have


$(-\xi)^{1/2} < 2.6 \times 10^{-35}$. (92)
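The coefficient in (91) and the resulting bound (92) can be reproduced from the band frequency and the path length. This is a sketch under stated assumptions: the year length in seconds and the function name `skewon_bound` are introduced here, while the frequencies, the 45.328 Gyr path and the $10^{-4}$ spectral accuracy are taken from the text.

```python
import math

SECONDS_PER_YEAR = 3.156e7  # assumed: approximate length of a year in seconds


def skewon_bound(frequency_hz, path_gyr, max_fractional_deviation):
    """Coefficient (1/2)*omega*t multiplying (-xi)**(1/2), and the implied bound."""
    omega = 2.0 * math.pi * frequency_hz
    path_s = path_gyr * 1e9 * SECONDS_PER_YEAR
    coefficient = 0.5 * omega * path_s
    return coefficient, max_fractional_deviation / coefficient


# Highest Planck band (857 GHz), path 45.328 Gyr, spectral agreement to 1e-4.
coefficient, bound = skewon_bound(857e9, 45.328, 1e-4)
band_ratio = 30.0 / 857.0  # lowest band relative to highest band
print(f"coefficient ~ {coefficient:.2e}, bound ~ {bound:.1e}, ratio ~ {band_ratio:.3f}")
```

Because the effect scales linearly with frequency, the lowest band contributes only a few percent of the highest-band value, which is why the highest band sets the constraint.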


When the spacetime constitutive tensor is constructed from metric, dilaton and
axion plus skewon, the principal part ${}^{(P)}\chi^{ijkl}$ of the constitutive tensor is given
by (86). There are two cases, (a) ${}^{(Sk)}B = 0$ and (b) ${}^{(Sk)}B \neq 0$, as mentioned
after Eq. (87). For case (a), $\xi = 0$, there are no birefringence and no
dissipation/amplification in wave propagation; by the theorem in this subsection, the
skewon part must be of Type II. For case (b), $\xi < 0$, ${}^{(Sk)}B \neq 0$, there are both
dissipative and amplifying modes in wave propagation and we can apply (92) from
the CMB observations to constrain the skewon part of the constitutive tensor as
follows:


$\tfrac{1}{2}(-\xi)^{1/2} = |{}^{(Sk)}B| = \tfrac{1}{2}|B_{(1)} - B_{(2)}|$
$\quad = |{}^{(Sk)}\chi^{(1)1020} + {}^{(Sk)}\chi^{(1)1323} - {}^{(Sk)}\chi^{(1)1023} - {}^{(Sk)}\chi^{(1)1320}| < 1.3 \times 10^{-35}$, (93)


for propagation in the z-direction. Since the CMB observation is omnidirectional,
we have the above constraint for many directions. From a few superpositions, we
obtain the lemma in this subsection; hence the constraints (54a)-(54l) hold to
about a few × $10^{-35}$ and the spacetime skewon field is of Type II, with the Type I skewon
field constrained to about a few × $10^{-35}$ cosmologically in the first order. Thus, the
significant skewon field must be of Type II with six degrees of freedom in the first
order.


Constraints on the skewon field in the second-order (Ref. 17)

For metric principal part plus skewon part, we have shown that the Type I skewon
part is constrained to < a few × $10^{-35}$ in the weak field/weak EEP violation
limit. The Type II skewon part is not constrained in the first order. In the second
order, Obukhov and Hehl have shown in Sec. IV.A.1 of Ref. 86 that it induces
birefringence; since the nonbirefringence observations are precise to $10^{-38}$ as listed
in Table 1, they constrain the Type II skewon part to ~$10^{-19}$ (Refs. 17, 86). However,
an additional nonmetric induced second-order contribution to the principal part of the
constitutive tensor compensates the Type II skewon birefringence and makes it
nonbirefringent (Ref. 17). This second-order contribution is just the extra piece to the
(symmetric) core metric principal constitutive tensor induced by the antisymmetric
part of the asymmetric metric tensor $q^{ij}$ (Ref. 17). Table 3 lists various first-order and
second-order effects in wave propagation on media with the core metric-based
constitutive tensors (Ref. 17). In the following subsection, we review the spacetime/medium
with constitutive tensor induced from asymmetric metric.


3.7. Constitutive tensor from asymmetric metric and Fresnel 
equation 


Eddington (Ref. 87), Einstein and Straus (Ref. 88), and Schrödinger (Refs. 89, 90) considered
asymmetric metric in their exploration of gravity theories. Just like we can build the spacetime
constitutive tensor from the (symmetric) metric as in metric theories of gravity, we
can also build it from the asymmetric metric. Let $q^{ij}$ be the asymmetric metric as
follows:

$\chi^{ijkl} = \tfrac{1}{2}(-q)^{1/2}(q^{ik}q^{jl} - q^{il}q^{jk})$, (94)
with $q = \det^{-1}({}^{(S)}q^{ij})$. When $q^{ij}$ is symmetric, this definition reduces to that of the
metric theories of gravity. The constitutive law (94) was also put forward by Lindell
and Wallén (Ref. 91) as Q-medium.


Table 3. Various first-order and second-order effects in wave propagation on media with the core metric-based constitutive tensors. ${}^{(P)}\chi^{(c)}$ is
the extra contribution due to the antisymmetric part of the asymmetric metric to the core metric principal part for canceling the skewon contribution
to birefringence/amplification-dissipation.

Constitutive tensor density $\chi^{ijkl}$ | Birefringence (in the geometric optics approximation) | Dissipation/amplification | Spectroscopic distortion | CPR
Metric: $(1/2)(-h)^{1/2}[h^{ik}h^{jl} - h^{il}h^{kj}]$ | No | No | No | No
Metric + dilaton: $(1/2)(-h)^{1/2}[h^{ik}h^{jl} - h^{il}h^{kj}]\psi$ | No (to all orders in the field) | Yes (due to dilaton gradient) | No | No
Metric + Abelian axion: $(1/2)(-h)^{1/2}[h^{ik}h^{jl} - h^{il}h^{kj}] + \varphi e^{ijkl}$ | No (to all orders in the field) | No | No | Yes (due to axion gradient)
Metric + dilaton + Abelian axion: $(1/2)(-h)^{1/2}[h^{ik}h^{jl} - h^{il}h^{kj}]\psi + \varphi e^{ijkl}$ | No (to all orders in the field) | Yes (due to dilaton gradient) | No | Yes (due to axion gradient)
Metric + Type I skewon | No to first-order | Yes | Yes | No
Metric + Type II skewon | No to first-order; yes to second-order | No to first-order and to second-order | No | No
Metric + ${}^{(P)}\chi^{(c)}$ + Type II skewon | No to first-order; no to second-order | No to first-order and to second-order | No | No
Asymmetric metric induced: $(1/2)(-q)^{1/2}(q^{ik}q^{jl} - q^{il}q^{jk})$ | No (to all orders in the field) | No | No | Yes (due to axion gradient)


Resolving the asymmetric metric into the symmetric part
${}^{(S)}q^{ij}$ and the antisymmetric part ${}^{(A)}q^{ij}$:


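$q^{ij} \equiv {}^{(S)}q^{ij} + {}^{(A)}q^{ij}$, with ${}^{(S)}q^{ij} \equiv \tfrac{1}{2}(q^{ij} + q^{ji})$ and ${}^{(A)}q^{ij} \equiv \tfrac{1}{2}(q^{ij} - q^{ji})$, (95)


we can decompose the constitutive tensor into the principal part ${}^{(P)}\chi^{ijkl}$, the axion
part ${}^{(Ax)}\chi^{ijkl}$ and the skewon part ${}^{(Sk)}\chi^{ijkl}$ as follows®-9?:

$\chi^{ijkl} = \tfrac{1}{2}(-q)^{1/2}(q^{ik}q^{jl} - q^{il}q^{jk}) = {}^{(P)}\chi^{ijkl} + {}^{(Ax)}\chi^{ijkl} + {}^{(Sk)}\chi^{ijkl}$, (96a)

with

${}^{(P)}\chi^{ijkl} \equiv \tfrac{1}{2}(-q)^{1/2}({}^{(S)}q^{ik}\,{}^{(S)}q^{jl} - {}^{(S)}q^{il}\,{}^{(S)}q^{jk} + {}^{(A)}q^{ik}\,{}^{(A)}q^{jl} - {}^{(A)}q^{il}\,{}^{(A)}q^{jk} - 2\,{}^{(A)}q^{[ik}\,{}^{(A)}q^{jl]})$, (96b)
${}^{(Ax)}\chi^{ijkl} \equiv (-q)^{1/2}\,{}^{(A)}q^{[ik}\,{}^{(A)}q^{jl]}$, (96c)
${}^{(Sk)}\chi^{ijkl} \equiv \tfrac{1}{2}(-q)^{1/2}({}^{(A)}q^{ik}\,{}^{(S)}q^{jl} - {}^{(A)}q^{il}\,{}^{(S)}q^{jk} + {}^{(S)}q^{ik}\,{}^{(A)}q^{jl} - {}^{(S)}q^{il}\,{}^{(A)}q^{jk})$. (96d)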
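The split of the asymmetric-metric constitutive tensor into principal, axion and skewon parts can be checked numerically for a randomly chosen metric. The sketch below is illustrative only: the overall scalar factor $(-q)^{1/2}$ is dropped since it multiplies every part equally, the bracket in the axion part is read as total antisymmetrization over the four indices of the product of the two antisymmetric factors, and all helper names are assumptions introduced here.

```python
import itertools
import random

IDX = range(4)


def perm_sign(perm):
    """Sign of a permutation, found by sorting it with transpositions."""
    p, sign = list(perm), 1
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]
            sign = -sign
    return sign


def tensor(fn):
    """Rank-4 array stored as a dict keyed by (i, j, k, l)."""
    return {ijkl: fn(*ijkl) for ijkl in itertools.product(IDX, repeat=4)}


def total_antisym(t):
    """Totally antisymmetric part of a rank-4 array."""
    perms = [(p, perm_sign(p)) for p in itertools.permutations(IDX)]
    return {idx: sum(s * t[tuple(idx[n] for n in p)] for p, s in perms) / 24.0
            for idx in t}


def pair(a, b):
    """a^{ik} b^{jl} - a^{il} b^{jk}, indexed by (i, j, k, l)."""
    return tensor(lambda i, j, k, l: a[i][k] * b[j][l] - a[i][l] * b[j][k])


random.seed(0)
m = [[random.gauss(0.0, 1.0) for _ in IDX] for _ in IDX]   # arbitrary test matrix
s = [[0.5 * (m[i][j] + m[j][i]) for j in IDX] for i in IDX]  # symmetric part
a = [[0.5 * (m[i][j] - m[j][i]) for j in IDX] for i in IDX]  # antisymmetric part
q = [[s[i][j] + a[i][j] for j in IDX] for i in IDX]

chi = {key: 0.5 * val for key, val in pair(q, q).items()}
axion = total_antisym(tensor(lambda i, j, k, l: a[i][k] * a[j][l]))
ss, aa, as_, sa = pair(s, s), pair(a, a), pair(a, s), pair(s, a)
principal = {key: 0.5 * (ss[key] + aa[key]) - axion[key] for key in chi}
skewon = {key: 0.5 * (as_[key] + sa[key]) for key in chi}
recombined = {key: principal[key] + axion[key] + skewon[key] for key in chi}


def max_abs_diff(t1, t2):
    return max(abs(t1[key] - t2[key]) for key in t1)


print(max_abs_diff(chi, recombined), max_abs_diff(total_antisym(chi), axion))
```

In this check the three parts add back up to the full tensor, the axion part coincides with the totally antisymmetric part of the full tensor, and the skewon part changes sign under exchange of the index pairs (ij) and (kl), which is the defining property of a skewon piece.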


The axion part ${}^{(Ax)}\chi^{ijkl}$ only comes from the second-order terms of ${}^{(A)}q^{ij}$.
Using ${}^{(S)}q^{ij}$ to raise and its inverse to lower the indices, we have, as Eq. (16) in
Ref. 62,

$S_{ij} = \tfrac{1}{4}\epsilon_{ijmk}\,{}^{(A)}q^{mk}$; ${}^{(A)}q^{mk} = -\epsilon^{mkij}S_{ij}$, (97)


where $\epsilon_{ijmk}$ and $\epsilon^{mkij}$ are respectively the completely antisymmetric covariant and
contravariant tensors with $\epsilon^{0123} = 1$ and $\epsilon_{0123} = -1$ in the local inertial frame. Thus
the skewon field $S_{ij}$ from the asymmetric metric $q^{ij}$ is antisymmetric and is of Type II.


Dispersion relation in the geometrical optics limit. The dispersion relation for the
wave covector $q_i$ of electromagnetic propagation with general constitutive tensor
(26) in the geometric-optics limit is given by the generalized covariant Fresnel
equation®:


$G^{ijkl}(\chi)\,q_i q_j q_k q_l = 0$, (98)

where $G^{ijkl}(\chi)$ ($= G^{(ijkl)}(\chi)$) is a completely symmetric fourth-order Tamm-Rubilar
(TR) tensor density of weight +1 defined by


$G^{ijkl}(\chi) \equiv \tfrac{1}{4!}\,e_{mnpq}\,e_{rstu}\,\chi^{mnr(i}\chi^{j|ps|k}\chi^{l)qtu}$. (99)

There are two ways to obtain the TR tensor density (99) for the dispersion relation 
(98). One way is by straightforward calculation; the other is by covariant method.” 
In the appendix of arXiv:1411.0460v1, we outline the straightforward calculation 
to obtain the TR tensor density $G^{ijkl}(\chi)$ for the asymmetric metric induced
constitutive tensor:


- 1 er 1 6s _ 
G(x) = (F) -ap/2aerta alah = (3) (a) 2den(a) 44h, 
(100) 


Except for a scalar factor, (100) is the same as for the metric-induced constitutive
tensor with $^{(S)}q_{ij}$ replacing the metric $g_{ij}$ or $h_{ij}$. Therefore in the geometric optics
approximation, there is no birefringence and the unique light cone is given by the
metric $^{(S)}q_{ij}$.

Constraints on asymmetric-metric induced constitutive tensor.” Although the 
asymmetric-metric induced constitutive tensor leads to a Fresnel equation which is 
nonbirefringent, it contains an axionic part: 

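$${}^{(Ax)}\chi^{ijkl} = (-q)^{1/2}\,{}^{(A)}q^{[ik}\,{}^{(A)}q^{jl]} = \varphi\, e^{ijkl}; \qquad \varphi \equiv \frac{1}{4!}\, e_{ijkl}\,(-q)^{1/2}\,{}^{(A)}q^{[ik}\,{}^{(A)}q^{jl]}, \tag{101}$$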


which induces polarization rotation in wave propagation. Constraints on CPR
and its fluctuation limit the axionic part and therefore also constrain the asymmetric
metric. The variation of $\varphi$ ($\equiv (1/4!)\, e_{ijkl}(-q)^{1/2}\,{}^{(A)}q^{ik}\,{}^{(A)}q^{jl}$) is limited by
observations®?—*4-0,61 on the CPR to < 0.02 and its fluctuation to < 0.03 since 
the last scattering surface, and in turn constrains the antisymmetric metric of the 
spacetime for this degree of freedom. The antisymmetric metric has six degrees of 
freedom. Further study of the remaining five degrees of freedom experimentally to 
find either evidence or more constraints would be desired. 

Theoretically, there are two issues: one is whether the asymmetric-metric 
induced constitutive tensors with additional axion piece are the most general non- 
birefringent media in the lowest geometric optics limit; the other is what they play 
in the spacetime structure and in the cosmos. 


3.8. Empirical foundation of the closure relation for skewonless 


case” 62 


There are two equivalent definitions of the constitutive tensor which are useful in various
discussions (see, e.g. Ref. 6). The first one is to take a dual on the first two indices
of $\chi^{ijkl}$:

$$\kappa_{ij}{}^{kl} \equiv \tfrac{1}{2}\, e_{ijmn}\, \chi^{mnkl}, \tag{102}$$

where $e_{ijmn}$ is the completely antisymmetric tensor density of weight $-1$ with
$e_{0123} = 1$. Since $e_{ijmn}$ is a tensor density of weight $-1$ and $\chi^{mnkl}$ a tensor density
of weight +1, $\kappa_{ij}{}^{kl}$ is a (twisted) tensor. From (102), we have

$$\chi^{ijkl} = \tfrac{1}{2}\, e^{ijmn}\, \kappa_{mn}{}^{kl}. \tag{103}$$

With this definition of the constitutive tensor $\kappa_{ij}{}^{kl}$, the constitutive relation (12)
becomes

$${}^{*}H_{ij} = \kappa_{ij}{}^{kl} F_{kl}, \tag{104}$$

where ${}^{*}H_{ij}$ is the dual of $H^{ij}$, i.e.

$${}^{*}H_{ij} \equiv \tfrac{1}{2}\, e_{ijmn} H^{mn}. \tag{105}$$

The second equivalent definition of the constitutive tensor is to use a 6 × 6 matrix
representation $\kappa_I{}^J$. Since $\kappa_{ij}{}^{kl}$ is nonzero only when the antisymmetric pairs of
indices (ij) and (kl) have values (01), (02), (03), (23), (31), (12), these index pairs
can be enumerated by capital letters I, J, ... from 1 to 6 to obtain $\kappa_I{}^J$ ($\equiv \kappa_{ij}{}^{kl}$).
With the relabeling, $F_{ij} \to F_I$, $H^{ij} \to H^I$, $e_{ijmn} \to e_{IJ}$, $e^{ijmn} \to e^{IJ}$. We have
$F_I = (\mathbf{E}, -\mathbf{B})$ and $({}^{*}H)_I = (-\mathbf{H}, -\mathbf{D})$. $e_{IJ}$ and $e^{IJ}$ can be expressed in matrix
form as

$$e_{IJ} = e^{IJ} = \begin{pmatrix} 0 & I_3 \\ I_3 & 0 \end{pmatrix}, \tag{106}$$

where $I_3$ is the 3 × 3 unit matrix. In terms of this definition, the constitutive relation
(104) becomes

$${}^{*}H_I = 2\,\kappa_I{}^J F_J, \tag{107}$$

where ${}^{*}H_I \equiv {}^{*}H_{ij} = e_{IJ} H^J$. The axion part $^{(Ax)}\chi^{ijkl}$ ($= \varphi e^{ijkl}$) now corresponds to

$${}^{(Ax)}\kappa_I{}^J = \varphi \begin{pmatrix} I_3 & 0 \\ 0 & I_3 \end{pmatrix} = \varphi\, I_6, \tag{108}$$

where $I_6$ is the 6 × 6 unit matrix. The principal part and the Abelian axion part of
the constitutive tensor all satisfy the following equation (the skewonless condition):

$$e^{KJ} \kappa_J{}^I = e^{IJ} \kappa_J{}^K. \tag{109}$$


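In terms of $\kappa_{ij}{}^{kl}$ and the re-indexed $\kappa_I{}^J$, the constitutive tensor (60) is represented
in the following forms:

$$\kappa_{ij}{}^{kl} = \tfrac{1}{2}\, e_{ijmn}\,\chi^{mnkl} = \tfrac{1}{2}\, e_{ijmn}(-h)^{1/2} h^{mk} h^{nl}\,\psi + \varphi\,\delta_{ij}{}^{kl}, \tag{110}$$

$$\kappa_I{}^J = \tfrac{1}{2}\, e_{ijmn}(-h)^{1/2} h^{mk} h^{nl}\,\psi + \varphi\,\delta_I{}^J, \tag{111}$$

where $\delta_{ij}{}^{kl}$ is a generalized Kronecker delta defined as

$$\delta_{ij}{}^{kl} \equiv \delta_i^k \delta_j^l - \delta_i^l \delta_j^k. \tag{112}$$

In the derivation, we have used the formula

$$e_{ijmn}\, e^{mnkl} = 2\,\delta_{ij}{}^{kl}. \tag{113}$$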
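Let us calculate $\kappa_{ij}{}^{kl}\kappa_{kl}{}^{pq}$ for the constitutive tensor (110):

$$\begin{aligned}
\kappa_{ij}{}^{kl}\kappa_{kl}{}^{pq} &= \big[\tfrac{1}{2}\, e_{ijmn}(-h)^{1/2} h^{mk} h^{nl}\,\psi + \varphi\,\delta_{ij}{}^{kl}\big] \times \big[\tfrac{1}{2}\, e_{klrs}(-h)^{1/2} h^{rp} h^{sq}\,\psi + \varphi\,\delta_{kl}{}^{pq}\big] \\
&= -\tfrac{1}{2}\,\delta_{ij}{}^{pq}\,\psi^2 + 2\varphi\, e_{ijmn}(-h)^{1/2} h^{mp} h^{nq}\,\psi + 2\,\delta_{ij}{}^{pq}\,\varphi^2 \\
&= -\tfrac{1}{2}\,\delta_{ij}{}^{pq}\,\psi^2 + 4\varphi\,{}^{(P)}\kappa_{ij}{}^{pq} + 2\,\delta_{ij}{}^{pq}\,\varphi^2,
\end{aligned} \tag{114}$$

where we have used (113) and the following relations

$$e_{klrs}\, h^{mk} h^{nl} h^{rp} h^{sq} = e^{mnpq} \det(h^{uv}), \tag{115}$$

$$\det(h^{uv}) = [\det(h_{uv})]^{-1} = h^{-1}, \tag{116}$$

$$\delta_{ij}{}^{kl}\,\delta_{kl}{}^{pq} = 2\,\delta_{ij}{}^{pq}. \tag{117}$$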

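In terms of the six-dimensional index I, Eq. (114) becomes

$$\kappa_I{}^J \kappa_J{}^K = \tfrac{1}{2}\,\kappa_{ij}{}^{kl}\kappa_{kl}{}^{pq} = -\tfrac{1}{4}\,\delta_I{}^K\,\psi^2 + 2\varphi\,{}^{(P)}\kappa_I{}^K + \delta_I{}^K\,\varphi^2 = -\tfrac{1}{4}\,\delta_I{}^K\,\psi^2 + 2\varphi\,\kappa_I{}^K - \delta_I{}^K\,\varphi^2. \tag{118}$$

Thus the matrix multiplication of $\kappa_I{}^J$ with itself is a linear combination of itself
and the identity matrix, and generates a closed algebra of linear dimension 2. The
algebraic relation (118) is a closure relation that generalizes the following closure
relation in electrodynamics:

$$\kappa\kappa = (\kappa_I{}^J \kappa_J{}^K) = \tfrac{1}{6}\,\mathrm{tr}(\kappa\kappa)\, I_6. \tag{119}$$

The matrix multiplication of the principal part $^{(P)}\kappa_I{}^J$ satisfies the closure relation (119). In case $\varphi = 0$,
the Abelian axion part $^{(Ax)}\kappa_I{}^J$ of the constitutive tensor vanishes and (118) reduces
to the closure relation (119).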
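
The closure relation can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes $h^{ij} = \mathrm{diag}(1,-1,-1,-1)$ (so $(-h)^{1/2} = 1$), builds the core metric form with dilaton $\psi$ and axion $\varphi$, forms the 6 × 6 matrix $\kappa_I{}^J = \tfrac{1}{2} e_{ijmn}\chi^{mnkl}$ with the pair ordering (01), (02), (03), (23), (31), (12), and compares $\kappa\kappa$ with $-\tfrac{1}{4}\psi^2 I_6 + 2\varphi\kappa - \varphi^2 I_6$. The function names are hypothetical and not taken from the references.

```python
from itertools import permutations

# Levi-Civita symbol with e_{0123} = e^{0123} = +1 (tensor-density convention).
LEVI = {}
for perm in permutations(range(4)):
    inversions = sum(1 for a in range(4) for b in range(a + 1, 4) if perm[a] > perm[b])
    LEVI[perm] = -1 if inversions % 2 else 1

# Antisymmetric index pairs enumerated I = 1..6: (01),(02),(03),(23),(31),(12).
PAIRS = [(0, 1), (0, 2), (0, 3), (2, 3), (3, 1), (1, 2)]

# Assumption for this sketch: h^{ij} = diag(1,-1,-1,-1), hence (-h)^{1/2} = 1.
H_UP = [1.0, -1.0, -1.0, -1.0]


def h_up(i, j):
    """Component h^{ij} of the (diagonal) inverse metric."""
    return H_UP[i] if i == j else 0.0


def chi(i, j, k, l, psi, phi):
    """Core metric form: (1/2)(-h)^{1/2}[h^ik h^jl - h^il h^jk] psi + phi e^{ijkl}."""
    metric_part = 0.5 * (h_up(i, k) * h_up(j, l) - h_up(i, l) * h_up(j, k)) * psi
    return metric_part + phi * LEVI.get((i, j, k, l), 0)


def kappa6(psi, phi):
    """6x6 matrix kappa_I^J = (1/2) e_{ijmn} chi^{mnkl}."""
    return [[0.5 * sum(LEVI.get((i, j, m, n), 0) * chi(m, n, k, l, psi, phi)
                       for m in range(4) for n in range(4))
             for (k, l) in PAIRS]
            for (i, j) in PAIRS]


def closure_residual(psi, phi):
    """Largest |entry| of kappa.kappa - [-(1/4) psi^2 I + 2 phi kappa - phi^2 I]."""
    kap = kappa6(psi, phi)
    worst = 0.0
    for a in range(6):
        for c in range(6):
            product = sum(kap[a][b] * kap[b][c] for b in range(6))
            delta = 1.0 if a == c else 0.0
            expected = -0.25 * psi**2 * delta + 2.0 * phi * kap[a][c] - phi**2 * delta
            worst = max(worst, abs(product - expected))
    return worst


# The residual should sit at floating-point rounding level for any psi, phi.
print(closure_residual(psi=1.3, phi=0.7))
```

Setting `phi=0.0` reproduces the axionless case, in which $\kappa\kappa$ is proportional to the identity matrix.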

From the nonbirefringence condition (60), we derive the closure relation (118)
in a number of algebraic steps which consist of order 100 individual operations
of addition/subtraction or multiplication. Equation (60) is empirically verified to
$10^{-38}$. Therefore Eq. (118) is empirically verified to $10^{-37}$ (precision $10^{-38}$ times
$100^{1/2}$). Hence, when there are no axion and no dilaton, the closure relation (119)
is empirically verified to $10^{-37}$. Since the dilaton is constrained to $8 \times 10^{-4}$, if one allows
for a dilaton, relation (119) is verified to $8 \times 10^{-4}$ since the last scattering surface
of the CMB; since the axion is constrained to about $10^{-2}$, if one allows for an axion in addition,
relation (119) is verified to about $10^{-2}$ since the last scattering surface of the CMB.
As pointed out by Favaro (private communication), the above method could also
readily be applied to the other three variants of closure relations (Eqs. (3.2), (3.3),
(3.4) in Ref. 92).
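
The factor $100^{1/2}$ corresponds to adding about 100 independent errors of equal size in quadrature. A minimal sketch of that arithmetic, assuming uncorrelated per-operation errors (an assumption of this illustration, not a statement about the actual error budget):

```python
import math


def accumulated_error(per_step_error: float, n_steps: int) -> float:
    """Root-sum-square growth of n independent errors of equal size."""
    return per_step_error * math.sqrt(n_steps)


input_precision = 1e-38   # empirical precision quoted for Eq. (60)
n_operations = 100        # order-of-magnitude count of algebraic operations

print(f"{accumulated_error(input_precision, n_operations):.1e}")  # 1.0e-37
```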

The closure relation (119) can also be called the idempotent condition, for it states
that the multiplication of $\kappa$ by itself goes back essentially to itself. Toupin,$^{93}$
Schönberg$^{94}$ and Jadczyk$^{95}$ in their theoretical approach started from this condition
to obtain the metric induced constitutive tensor with a dilaton degree of freedom. In
this section, we have started with the Galileo equivalence principle for photons, i.e. the
nonbirefringence condition, to obtain the metric induced core metric form with a
dilaton degree of freedom and an axion degree of freedom for the constitutive tensor
and then the generalized closure relation (118). We have also shown that (118) is
verified empirically to very high precision. Thus in the axionless (and skewonless)
case, the nonbirefringence condition and the idempotent condition are equivalent and both
are verified empirically to high precision.


4. From Galileo Equivalence Principle to Einstein Equivalence 
Principle 


In Sec. 3, we have used equivalence principles in the photon sector to constrain 
the gravitational coupling to electromagnetism and the structure of spacetime from 
premetric electrodynamics. In this section, we review and discuss theoretically to 
what extent Galileo equivalence principle leads to EEP, i.e. Schiff’s conjecture. 

In the 1970s, we used the Galileo Equivalence Principle and derived its consequences for
an electromagnetic system with Lagrangian density $L_I$ ($= L_I^{(EM)} + L_I^{(EM-P)} + L_I^{(P)}$),
where the electromagnetic field Lagrangian $L_I^{(EM)}$ and the field-current interaction
Lagrangian $L_I^{(EM-P)}$ are given by (17a), (17b), and the particle Lagrangian $L_I^{(P)}$
is given by $-\sum_I m_I (ds_I/dt)\,\delta(\mathbf{x} - \mathbf{x}_I)$ with $m_I$ the mass of the Ith particle, $s_I$
its 4-line element from the metric $g_{ij}$, $\mathbf{x}_I$ its position 3-vector, $\mathbf{x}$ the coordinate
3-vector, and $t$ the time coordinate$^{20,21}$:

$$L_I = L_I^{(EM)} + L_I^{(EM-P)} + L_I^{(P)} = -\frac{1}{16\pi}\,\chi^{ijkl} F_{ij} F_{kl} - A_k J^k - \sum_I m_I \frac{ds_I}{dt}\,\delta(\mathbf{x} - \mathbf{x}_I), \tag{120}$$

$$J^k = \sum_I e_I \frac{dx_I^k}{dt}\,\delta(\mathbf{x} - \mathbf{x}_I). \tag{120a}$$


Here $e_I$ is the charge of the Ith particle. In (120), only the part of $\chi^{ijkl}$ which is
symmetric under the interchange of index pairs ij and kl contributes to the Lagrangian,
i.e. the constitutive tensor is effectively skewonless. This framework is termed the $\chi$-$g$
framework.


The result of imposing the Galileo Equivalence Principle is that the constitutive
tensor density $\chi^{ijkl}$ can be constrained and expressed in metric form with an additional
pseudoscalar (axion) field $\varphi$:

$$\chi^{ijkl} = (-g)^{1/2}\big[\tfrac{1}{2}\, g^{ik} g^{jl} - \tfrac{1}{2}\, g^{il} g^{jk}\big] + \varphi\, e^{ijkl}, \tag{121}$$

where $g^{ij}$ is the metric of the geodesic motions of particles, $g_{ij}$ is the inverse of
$g^{ij}$, $g = \det(g_{ij})$ and $e^{ijkl}$ is the completely antisymmetric tensor density with
$e^{0123} = 1$ as defined in Sec. 3. Hence the metric $g^{ij}$ generates the light cone for
electromagnetic wave propagation also. The constraint (121) dictates the gravity
coupling to the electromagnetic field to be metric plus one additional axionic freedom.
With this one axionic freedom the EEP is violated, and therefore Schiff's conjecture
is invalid. However, the spirit of Schiff's conjecture is useful and constrains
the gravity coupling effectively. Since the theory with constitutive tensor density
(121) does not obey EEP, it is a nonmetric theory.

The theory with $\varphi \neq 0$ is a pseudoscalar theory with important astrophysical
and cosmological consequences. Its effect on electromagnetic wave propagation is
that the polarization rotation of linearly polarized light is proportional to the
difference of the (pseudo)scalar field at the two end points. We have discussed this
in detail in Sec. 3.4 and used CPR observations to constrain it. This is an example
of investigations in fundamental physical laws leading to implications in cosmology.
Investigations of CP problems in high energy physics lead to a theory with a similar
piece of Lagrangian with $\varphi$ the axion field for QCD.$^{96–103}$

In the nonmetric theory with $\chi^{ijkl}$ ($\varphi \neq 0$) given by Eq. (121),$^{20,21,40-53}$ there
are anomalous torques on electromagnetic-energy-polarized bodies so that different
test bodies will change their rotation state differently, like magnets in magnetic
fields. Since the motion of a macroscopic test body is determined not only by its
trajectory but also by its rotation state, the motion of polarized test bodies will not
be the same. We, therefore, have proposed the following stronger weak equivalence
principle (WEP II) to be tested by experiments, which states that in a gravitational
field, both the translational and rotational motion of a test body with a given initial
motion state is independent of its internal structure and composition (universality
of free-fall motion) (Sec. 2.2).$^{20,21}$ To put it another way, the behavior of motion
including rotation is that in a local inertial frame for test bodies. If WEP II is
violated, then EEP is violated. Therefore from the above, in the $\chi$-$g$ framework, the
imposition of WEP II guarantees that EEP is valid. These are the reasons for us to
propose WEP II. The $\chi$-$g$ framework has been extended to nonabelian gauge fields
for studying the interrelations of equivalence principles with similar conclusions.$^{104}$

From the empirical side, WEP I for unpolarized bodies is verified to very high
precision. However, these experiments only constrain two degrees of freedom of the $\chi$'s
in connection with the gravity coupling of matter. To constrain and connect more
degrees of freedom of the $\chi$'s to the gravity coupling of matter, we proposed in the 1970s to perform
WEP experiments on various polarized test bodies, both electromagnetically
polarized and spin-polarized test bodies. These polarized experiments are
also crucial to probe the role of spin and polarization in gravity. Now with the
spacetime constitutive tensor density constrained to the core metric form (60)
to the ultra-precision of $10^{-38}$, the polarized WEP experiments will test the gravity-matter
interaction more than the gravity-radiation interaction. In Sec. 7, we will update
our review$^{61}$ on the search for the long range/intermediate range spin-spin,
spin-monopole and spin-cosmic interactions.


5. EEP and Universal Metrology 


EEP states that all local physics is the same everywhere at any time in our cosmos.
Therefore if we base our metrology everywhere at any time on local physics with a
universal procedure, we have a universal metrology (see, e.g. Refs. 105 and 106).
For metrology, we need unit standards. At present all basic standards except for the
prototype mass standard are based on physical laws, their fundamental constants
and the microscopic properties of matter. The EEP says, in essence, that local physics
is the same everywhere. Therefore, to the precision of its empirical tests, EEP
warrants the universality of these standards and their implementations.

The name Système International d'Unités (International System of Units), with
the abbreviation SI, was adopted by the 11th Conférence Générale des Poids et
Mesures in 1960. After the 1983 redefinition of the meter as the length of the path traveled by
light in a vacuum during a time interval of 1/299792458 of a second, all definitions
of SI units can be traced to the definitions of the second and the kilogram. The second is
defined as the duration of 9 192 631 770 periods of the radiation corresponding to the
transition between the two hyperfine levels of the ground state of the cesium-133
atom. The kilogram is the unit of mass; it is equal to the mass of the international
prototype of the kilogram (IPK), a cylinder of platinum-iridium. The IPK is the
only physical artifact in the definition of the 7 SI base units (second, meter, kilogram,
ampere, kelvin, mole and candela for the 7 base quantities time, length, mass, electric
current, thermodynamic temperature, amount of substance and luminous intensity
respectively). Although the uncertainty of the mass of the IPK is zero by convention,
there is evidence that the mass of the IPK varies by a fraction of the order of $10^{-8}$
after storage or cleaning, with the estimated relative instability $\delta m/m \sim 5 \times 10^{-8}$
over the past 100 years.$^{107}$ When the mass unit is redefined by natural invariants,
the SI system will be free of artifacts. In order to ensure continuity of mass metrology,
it has been agreed that the relative uncertainty of any new realization must
be less than $2 \times 10^{-8}$ (see, e.g. Ref. 108). Sanchez et al.$^{109}$ at the National Research
Council of Canada (NRC) determined the Planck constant $h$ using the watt balance
to be $6.62607034(12) \times 10^{-34}$ J s, within $2 \times 10^{-8}$ relative uncertainty. NIST has
reached $5 \times 10^{-8}$ relative uncertainty and is building a new watt balance to reach
$2 \times 10^{-8}$ relative uncertainty.$^{110}$ The silicon sphere experiment of counting atoms
to determine the Avogadro constant reached $3 \times 10^{-8}$ relative uncertainty (see, e.g.
Ref. 108). In 2014, the Avogadro constant $N_A$ and the derived Planck constant $h$ based
on the absolute silicon molar mass measurements, with their standard uncertainties,
are $6.02214076(19) \times 10^{23}$ mol$^{-1}$ and $6.62607017(21) \times 10^{-34}$ J s.$^{111}$ The three
measurements of NIST,$^{111}$ PTB,$^{112}$ and NMIJ$^{113}$ agree within their stated uncertainties
and also agree with the NRC watt balance measurement within $1\sigma$. This
experimental progress sets the stage for a new definition of the kilogram using the Planck
constant/Avogadro number. The time is becoming ripe to replace all the definitions
of units using natural invariants.
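
The concise notation used above, such as 6.62607034(12), gives the standard uncertainty in units of the last quoted digits. The following sketch converts the three quoted values to relative uncertainties and compares them with the $2 \times 10^{-8}$ continuity requirement; the helper name and the dictionary labels are illustrative choices, not taken from the cited references.

```python
def relative_uncertainty(mantissa: str, last_digits_uncertainty: int) -> float:
    """Relative standard uncertainty of a value written as mantissa(uncertainty).

    Example: "6.62607034" with 12 means 6.62607034 +/- 0.00000012.
    The power-of-ten factor cancels in the ratio, so it is not needed here.
    """
    decimals = len(mantissa.split(".")[1])
    absolute = last_digits_uncertainty * 10.0 ** (-decimals)
    return absolute / float(mantissa)


# Values as quoted in the text.
measurements = {
    "NRC watt balance, h": ("6.62607034", 12),
    "Si molar mass, N_A": ("6.02214076", 19),
    "Si molar mass, derived h": ("6.62607017", 21),
}

TARGET = 2e-8  # required relative uncertainty for a new realization of the mass unit

for label, (mantissa, unc) in measurements.items():
    rel = relative_uncertainty(mantissa, unc)
    print(f"{label}: {rel:.1e} (within 2e-8 target: {rel <= TARGET})")
```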

In 2018, the 5 SI base quantities (time, length, mass, electric current and
thermodynamic temperature) will be replaced by frequency, velocity, action, electric
charge and heat capacity, pending the expected final resolution of the 26th
Conférence Générale des Poids et Mesures (CGPM) (see, e.g. Ref. 110). The two
defining constants for frequency and velocity will be the same as the present SI
defining constants of time and length. The defining constants for action, electric
charge, heat capacity and amount of substance will be the Planck constant $h$, the
elementary charge $e$, the Boltzmann constant $k$ and the Avogadro constant $N_A$
respectively. The mass unit can be traced to the action unit defined by the Planck
constant using the watt balance, or to the amount of substance defined by the Avogadro
constant based on counting the atoms in a $^{28}$Si crystal. In 2018, both methods
should reach an uncertainty smaller than $2 \times 10^{-8}$ to guarantee consistency and
continuity. The relative uncertainty of $N_A h$ at present is $7 \times 10^{-10}$ (CODATA 2010
adjustment$^{114}$), which guarantees consistency at the $2 \times 10^{-8}$ level.

With the new definition of units based on physical invariants of nature, the
applicability becomes wider; as long as the physical laws on which the units are based
are valid, the standards and metrology are universal. In Sec. 3, we have seen that
the unique light cone is experimentally verified to $10^{-38}$ via gamma ray
observations at cosmological distances; this verifies the Galileo equivalence principle for
photons/electromagnetic wave packets to this accuracy. This constrains the
spacetime (vacuum) constitutive tensor to the core metric form with additional dilaton and
axion degrees of freedom. In the solar system the variation of the dilaton field is
constrained to $10^{-10}\,U$; in the cosmos, the dilaton field is constrained to $8 \times 10^{-4}$ (Table
1). The universal metrology system is truly universal with the present accuracies. In
case the accuracies are pushed further, we either verify equivalence principles
further or discover new physics. Thus we see that universal metrology and equivalence
principles go hand-in-hand.

Equivalence principles play very important roles both in the Newtonian the- 
ory of gravity and relativistic theories of gravity. The ranges of validity of these 
equivalence principles or their possible violations give clues and/or constraints to 
the microscopic origins of gravity. They will be even more important when the 
precisions of the tests become higher. To pursue further tests of EEP, we have to 
look into precise experiments and observations in our laboratory, in the solar
system, and in diverse astrophysical and cosmological situations. All of these depend
on the progress in the field of precision measurement, and demand more precise
standards. The constancy of constants is implied by equivalence principles. Their
variations give new physics. 
The frequency measurement has the best relative uncertainty at present.
Optical clocks are reaching relative uncertainties at the $10^{-18}$ level.$^{115}$ When the
comparison of optical clocks becomes common, it is anticipated that the frequency
standards will go optical. Further improvement in frequency measurements will
have a profound impact on precision measurement and gravity experiments. In the
realm of gravitational wave detection, the influence will be to enhance the Doppler
tracking method and the PTA method.$^{116}$ An array of clocks may even become an
alternative method for detecting low frequency gravitational waves.


6. Gyrogravitational Ratio 


The gyrogravitational effect is defined to be the response of an angular momentum in
a gravitomagnetic field produced by a gravitating source having a nonzero angular
momentum. Ciufolini and Pavlis$^{117}$ have measured and verified this effect with
10-30% accuracy for the dragging of the orbit plane (orbit angular momentum)
of a satellite (LAGEOS) around a rotating planet (Earth) predicted for general
relativity by Lense and Thirring.$^{118}$ Gravity Probe B$^{119}$ (GP-B) has measured and
verified the dragging of the spin angular momentum of a rotating quartz ball predicted
by Schiff$^{120}$ for general relativity with 19% accuracy. The GP-B experiment has also
verified the Second Weak Equivalence Principle (WEP II) for macroscopic rotating
bodies to ultra-precision.$^{121}$ On 13 February 2012 the Italian Space Agency (ASI)
launched the LARES (LAser RElativity Satellite) satellite with a Vega rocket for
improving the measurement of the Lense-Thirring effect together with other geodesy
satellites.$^{122}$ On Earth, GINGER (Gyroscopes IN General Relativity) is a
multi-ring-laser array project aimed at measuring the Lense-Thirring effect to 1%.$^{123}$ When
this is achieved, the same technology could be applied to improve the tie between
the astronomic reference frame and the solar-system dynamical frame.

Just as in electromagnetism, we can define the gyrogravitational factor as the
gravitomagnetic moment (response) divided by the angular momentum for the gravitational
interaction. We use macroscopic (spin) angular momentum in GR as the standard; its
gyrogravitational ratio is 1 by definition. In Ref. 124, we use coordinate
transformations among reference frames to study and to understand the Lense-Thirring effect
of a Dirac particle. For a Dirac particle, the wave function transformation
operator from an inertial frame to a moving accelerated frame is obtained. According
to the equivalence principle, this gives the gravitational coupling to a Dirac particle.
From this, the Dirac wave function is solved and its change of polarization gives
the gyrogravitational ratio 1 from the first-order gravitational effects. In a series
of papers on spin-gravity interactions and the equivalence principle, Obukhov, Silenko
and Teryaev$^{125}$ have calculated directly the response of the spin of a Dirac
particle in a gravitomagnetic field and showed that it is the same as the response of
a macroscopic spin angular momentum in general relativity (see also Ref. 126 for
a derivation in the weak-field limit). Randono has shown that the active
frame-dragging of a polarized Dirac particle is the same as that of a macroscopic body
with equal angular momentum.$^{127}$ All these results are consistent with EEP and
the principle of action-equal-to-reaction. However, these findings do not preclude
the gyrogravitational ratio from being different from 1 in various other theories
of gravity, notably torsion theories and Poincaré gauge theories.

What would be the gyrogravitational ratios of actual elementary particles? If 
they differ from one, they will definitely reveal some inner gravitational structures 
of elementary particles, just as different gyromagnetic ratios reveal inner electro- 
magnetic structures of elementary particles. These findings would then give clues 
to the microscopic origin of gravity. 

Promising methods to measure particle gyrogravitational ratios include$^{61}$: (i)
using spin-polarized bodies (e.g. polarized solid He$^3$, Dy-Fe, Ho-Fe, or other
compounds) instead of rotating gyros in a GP-B type experiment to measure the
gyrogravitational ratio of various substances; (ii) atom interferometry; (iii) nuclear spin
gyroscopy; (iv) superfluid He$^3$ gyrometry. Notably, there have been great
developments in atom interferometry$^{128,129}$ and nuclear gyroscopy.$^{130}$ However, to measure
particle gyrogravitational ratios the precision is still short by several orders of magnitude and
more developments are required.


7. An Update of Search for Long Range/Intermediate Range
Spin-Spin, Spin-Monopole and Spin-Cosmos Interactions


In this section, we update our review$^{61,131}$ on the search for the long
range/intermediate range spin-spin, spin-monopole and spin-cosmic interactions.


Spin-spin experiments


The geomagnetic field induces electron polarization within the Earth. Hunter et al.$^{132}$
estimated that there are on the order of $10^{42}$ polarized electrons in the Earth
compared to $\sim 10^{25}$ polarized electrons in a typical laboratory. For the spin-spin
interaction, their results give an improvement in constraining the coupling
strength of the intermediate vector boson in the range greater than about 1 km.$^{132}$


Spin-monopole experiments


In Ref. 61, we have used the axion-like interaction Hamiltonian

$$H_{\mathrm{int}} = \left[\frac{\hbar\,(g_s g_p)}{8\pi m c}\right] \left(\frac{1}{\lambda r} + \frac{1}{r^2}\right) \exp\!\left(-\frac{r}{\lambda}\right) \boldsymbol{\sigma}\cdot\hat{\mathbf{r}} \tag{122}$$

to discuss the experimental constraints on the dimensionless coupling $g_s g_p/\hbar c$
between polarized (electron) and unpolarized (nucleon) particles. In (122), $\lambda$ is
the range of the interaction, $g_s$ and $g_p$ are the coupling constants of the vertices at the
polarized and unpolarized particles, $m$ is the mass of the polarized particle and $\boldsymbol{\sigma}$ is the
Pauli matrix 3-vector. Hoedl et al.$^{133}$ have pushed the constraint to shorter range
by about one order of magnitude since our last review.$^{61}$ In this update, we see also
good progress in the measurement of spin-monopole coupling between polarized
neutrons and unpolarized nucleons.$^{134–136}$ Tullney et al.$^{136}$ obtained the best limit
on this coupling for force ranges between $3 \times 10^{-4}$ m and 0.1 m. For a recent
analysis of a direct spin-axion momentum interaction and its empirical constraints,
see Ref. 137.
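
To make the range dependence of such a spin-monopole interaction concrete, the following sketch evaluates an energy of the standard monopole-dipole form, $[\hbar g_s g_p/(8\pi m c)](1/\lambda r + 1/r^2)\exp(-r/\lambda)\,\boldsymbol{\sigma}\cdot\hat{\mathbf{r}}$, which is assumed here to be the form intended in (122). With the dimensionless coupling $g_s g_p/\hbar c$ the prefactor becomes $(\text{coupling})\,\hbar^2/(8\pi m)$. The function name, the sample coupling and the choice of the electron mass are illustrative assumptions.

```python
import math

HBAR = 1.054571817e-34          # reduced Planck constant, J s
M_ELECTRON = 9.1093837015e-31   # electron mass, kg


def monopole_dipole_energy(r, lam, coupling, cos_theta=1.0, mass=M_ELECTRON):
    """Energy in joules of the assumed monopole-dipole form.

    r         : separation in meters
    lam       : interaction range lambda in meters
    coupling  : dimensionless g_s g_p / (hbar c)
    cos_theta : cosine of the angle between the spin and the unit vector r_hat
    """
    prefactor = coupling * HBAR**2 / (8.0 * math.pi * mass)
    radial = (1.0 / (lam * r) + 1.0 / r**2) * math.exp(-r / lam)
    return prefactor * radial * cos_theta


# Hypothetical numbers: coupling 1e-30, range 1 cm, separations of 1 mm and 5 cm.
for separation in (1e-3, 5e-2):
    print(separation, monopole_dipole_energy(separation, lam=1e-2, coupling=1e-30))
```

For separations well below the range the energy falls off roughly as $1/r^2$; beyond the range it is cut off exponentially, which is why each experiment constrains the coupling only over a limited window of $\lambda$.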


Spin-cosmos experiments


For the analysis of spin-cosmos experiments for elementary particles, one usually
uses the following Hamiltonian:

$$H_{\mathrm{cosmic}} = C_1\sigma_1 + C_2\sigma_2 + C_3\sigma_3, \tag{123}$$

in the cosmic frame of reference for a spin-half particle, with the $C$'s constants and the $\sigma$'s
the Pauli spin matrices (see, e.g. Ref. 138 or 61). The best constraint now is on the
bound neutron from a free-spin-precession $^{3}$He-$^{129}$Xe comagnetometer experiment
performed by Allmendinger et al.$^{130}$ The experiment measured the free
precession of nuclear spin polarized $^{3}$He and $^{129}$Xe atoms in a homogeneous magnetic
guiding field of about 400 nT. As the laboratory rotates with respect to distant
stars, Allmendinger et al. looked for a sidereal modulation of the Larmor
frequencies of the collocated spin samples due to (123) and obtained an upper limit of
$8.4 \times 10^{-34}$ GeV (68% c.l.) on the equatorial component $C_\perp^n$ for the neutron. This
constraint is more stringent by a factor of $3.7 \times 10^{4}$ than the limit on that for the electron.$^{139}$
Using a $^{3}$He-K co-magnetometer, Brown et al.$^{140}$ constrained $C_\perp^p$ for the proton
to be less than $6 \times 10^{-32}$ GeV. Recently Stadnik and Flambaum$^{141}$ analyzed the
nuclear spin contents of $^{3}$He and $^{129}$Xe together with a re-analysis of the data of
Ref. 130 to give the following improved limit on $C_\perp^p$: $C_\perp^p < 7.6 \times 10^{-33}$ GeV.
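
A short sketch of what a Hamiltonian of the form (123) implies for a spin-half particle may help to put the quoted limits in perspective. The matrix $C_1\sigma_1 + C_2\sigma_2 + C_3\sigma_3$ is traceless with determinant $-(C_1^2 + C_2^2 + C_3^2)$, so its eigenvalues are $\pm|C|$. The conversion of the level splitting $2|C|$ into a frequency is an illustration only and is not the analysis used in the cited experiments; the function names are hypothetical.

```python
import math

H_PLANCK_EV_S = 4.135667696e-15   # Planck constant in eV s


def spin_half_levels(c1, c2, c3):
    """Eigenvalues of c1*sigma_1 + c2*sigma_2 + c3*sigma_3.

    The matrix [[c3, c1 - i c2], [c1 + i c2, -c3]] is traceless with
    determinant -(c1^2 + c2^2 + c3^2), so the eigenvalues are -|C| and +|C|.
    """
    magnitude = math.sqrt(c1**2 + c2**2 + c3**2)
    return (-magnitude, magnitude)


def splitting_frequency_hz(c_gev):
    """Frequency corresponding to the level splitting 2|C|, with |C| in GeV."""
    return 2.0 * c_gev * 1e9 / H_PLANCK_EV_S


neutron_limit_gev = 8.4e-34   # equatorial-component limit quoted in the text
print(spin_half_levels(neutron_limit_gev, 0.0, 0.0))
print(splitting_frequency_hz(neutron_limit_gev))   # sub-nanohertz scale

# The electron limit implied by the quoted improvement factor of 3.7e4.
print(neutron_limit_gev * 3.7e4)
```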


8. Prospects 


After the cosmological electroweak (vacuum) phase transition around 100 ps from
the Big Bang, high energy photons came out. At that time it was difficult to do
measurement, although things may still have evolved according to precise physical laws,
notably quantum electrodynamics and classical electrodynamics. When our
universe cooled down, precision metrology became possible. Metrological standards
could be defined and implemented according to the fundamental physical laws. The
cosmic propagation according to Galileo's Weak Equivalence Principle for
photons (nonbirefringence) in the framework of premetric classical electrodynamics
of continuous media dictates that the spacetime constitutive tensor must be of
core metric form with an axion (pseudoscalar) degree of freedom and a dilaton
(scalar) degree of freedom. Propagation of pulsar pulses, radio galaxy signals and
cosmological GRBs has verified this conclusion empirically down to $10^{-38}$, i.e. to
$10^{-4} \times O([M_{\mathrm{Higgs}}/M_{\mathrm{Planck}}]^2)$. This is also the order to which the generalized closure
relations of electrodynamics are verified empirically. The axion and dilaton degrees
of freedom are further constrained empirically in the present phase of the cosmos
(Table 1). However, we should give a different thought to the axion and dilaton
degrees of freedom in exploring spacetime and gravitation in the very early
universe within 100 ps from the "Big Bang"; we could look for imprints of new physics
and new principles.
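
The correspondence between $10^{-38}$ and $10^{-4}$ times the square of the Higgs-to-Planck mass ratio is a one-line estimate. The sketch below uses the rounded values $M_{\mathrm{Higgs}} \approx 125$ GeV and $M_{\mathrm{Planck}} \approx 1.22 \times 10^{19}$ GeV; these inputs are standard rounded figures supplied for the illustration, not numbers quoted in this chapter.

```python
M_HIGGS_GEV = 125.0      # rounded Higgs boson mass
M_PLANCK_GEV = 1.22e19   # rounded Planck mass


def higgs_planck_ratio_squared(m_higgs=M_HIGGS_GEV, m_planck=M_PLANCK_GEV):
    """Second-order quantity (M_Higgs / M_Planck)^2."""
    return (m_higgs / m_planck) ** 2


ratio_sq = higgs_planck_ratio_squared()
print(f"{ratio_sq:.1e}")          # 1.0e-34
print(f"{1e-4 * ratio_sq:.1e}")   # 1.0e-38
```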

On the other hand, experiments with spin are important in verifying Galileo 
Equivalence Principle and Einstein Equivalence Principle which are important cor- 
nerstones of spacetime structure and gravitation. It is not surprising that cosmo- 
logical observations on polarization phenomena become the ultimate test ground of 
the equivalence principles, especially for the photon sector. Some of the dispersion 
relation tests are reaching second-order in the ratio of Higgs boson mass and Planck 
mass. Ultra-precise laboratory experiments are reaching ground in advancing con- 
straints on various (semi-)long-range spin interactions. Sooner or later, experimental 
efforts will reach the precision of measuring the gyrogravitational ratios of elemen- 
tary particles. All these developments may facilitate ways to explore the origins of 
gravity. 
Cosmic polarization rotation: An astrophysical test
of fundamental physics*

Possible violations of fundamental physical principles, e.g. the Einstein equivalence
principle on which all metric theories of gravity are based, including general relativity (GR),
would lead to a rotation of the plane of polarization for linearly polarized radiation
traveling over cosmological distances, the so-called cosmic polarization rotation (CPR). We
review here the astrophysical tests which have been carried out so far to check if CPR
exists. These are using the radio and ultraviolet polarization of radio galaxies and the
polarization of the cosmic microwave background (both E-mode and B-mode). These
tests so far have been negative, leading to upper limits of the order of one degree on any
CPR angle, thereby increasing our confidence in those physical principles, including GR.
We also discuss future prospects in detecting CPR or improving the constraints on it.


Keywords: Polarization; radio galaxies; cosmic background radiation. 


1. Introduction 


Linear polarization is a simple phenomenon by which a single photon is able to
transmit across the universe the information about the orientation of a plane. The
question which we discuss in this paper is whether the orientation of the plane
of linear polarization, the so-called position angle (PA$^{a}$), is conserved for
electromagnetic radiation traveling long distances, i.e. if there is any cosmic polarization
rotation (CPR). Clearly, if the CPR angle $\alpha$ is not zero, symmetry must be broken
at some level, since $\alpha$ must be either positive or negative, for a counterclockwise
or clockwise rotation. This immediately suggests that CPR should be connected
with the violation of fundamental physical principles. Indeed, it is linked also to a
possible violation of the Einstein equivalence principle (EEP), which is the
foundation of any metric theory of gravity, including general relativity (GR). Therefore it
deserves a chapter in this volume.


*This article was also published in Int. J. Mod. Phys. D 24, 1530016 (2015). This version has 
been updated to include the results of Planck and POLARBEAR. 

$^{a}$We adopt the International Astronomical Union (IAU) convention for PA: it increases
counterclockwise facing the source, from North through East.°° 
The fundamental principles whose violation would imply CPR are briefly discussed
in Sec. 2 (please refer also to other chapters in this volume). For most of
them, the CPR angle would be independent of wavelength. However the violation of 
some principles would imply a wavelength-dependent CPR, not to be confused with 
the Faraday rotation, which is a well-known effect for radiation passing through a 
plasma with a magnetic field. CPR, if it exists, would occur in vacuum. CPR has 
sometimes been inappropriately called “cosmological birefringence.” However we 
follow here the advice of Ni,®° since birefringence is only appropriate for a medium 
whose index of refraction depends on the direction of polarization of the incident 
light beam, which is then split in two. The phenomenon we are considering here is 
pure rotation of the polarization, without any splitting. 

Testing for CPR is simple in principle: it requires a distant source of linearly
polarized radiation, for which the orientation PA_em of the polarization at the
emission can be established. Then CPR is tested by comparing the observed orientation
PA_obs with PA_em:


\[ \alpha = \mathrm{PA}_{\mathrm{obs}} - \mathrm{PA}_{\mathrm{em}}. \]
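
Since a position angle describes an orientation rather than a direction, it is defined only modulo 180°, so the difference above has to be folded into a half-turn interval before it is read as a rotation. A minimal sketch in Python, assuming angles in degrees in the IAU convention; the function name `cpr_angle` and the choice of the interval [−90°, 90°) are illustrative assumptions, not taken from the text:

```python
def cpr_angle(pa_obs_deg, pa_em_deg):
    """Return PA_obs - PA_em folded into [-90, 90) degrees.

    Position angles are orientations, so they are only defined
    modulo 180 degrees; positive values are counterclockwise (IAU).
    """
    diff = pa_obs_deg - pa_em_deg
    return (diff + 90.0) % 180.0 - 90.0


# Hypothetical example: polarization expected at PA = 175 deg and
# observed at PA = 3 deg, i.e. a counterclockwise rotation of 8 deg.
print(cpr_angle(3.0, 175.0))  # 8.0
```

Without the folding step, the same pair of angles would give −172°, which describes the same orientation change but hides the fact that the rotation is small.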


In practice, it is not easy to know a priori the orientation of the polarization
for a distant source: in this respect the fact that scattered radiation is polarized
perpendicularly to the plane containing the incident and scattered rays has been
of great help, applied both to radio galaxies (RGs) (see Sec. 4) and to the cosmic
microwave background (CMB) radiation (see Sec. 5). For those cases in which CPR depends
on wavelength, one can also test CPR by simply searching for variation of PA with
the wavelength of the radiation, even without knowing PA_em. In this paper, we
review the astrophysical methods which have been used to test CPR, list
the results of these tests, discuss the advantages and disadvantages of the various
methods and suggest future prospects for these tests.


2. Impact of CPR on Fundamental Physics 


The possibility of CPR arises in a variety of important contexts, like the presence
of a cosmological pseudoscalar condensate, Lorentz invariance violation and charge,
parity and time reversal (CPT) violation, neutrino number asymmetry, and violation
of the EEP. In particular, the connection of the latter with CPR is relevant for this
GR centennial year, since all metric theories of gravity, including GR, are based on
the EEP. Since the weak equivalence principle (WEP) is tested to a much higher
accuracy than the EEP, Schiff®* conjectured that any consistent Lorentz-invariant
theory of gravity which obeys the WEP would necessarily also obey the EEP. If
this were true, the EEP would be tested to the same accuracy as the WEP, increasing
our experimental confidence in GR. However, Ni®”°® found a unique counterexample
to Schiff’s conjecture: a pseudoscalar field which would lead to a violation
of the EEP, while obeying the WEP. Such a field would produce a CPR. Therefore,
testing for the CPR is important for our confidence in GR. For the other theoretical
impacts of CPR we refer the reader to Refs. 59, 60 and 62.


3. Constraints from the Radio Polarization of RGs


Already in his seminal paper about the unique counterexample to Schiff’s conjecture
giving rise to CPR, Ni>’ suggested that observations of polarized astrophysical
sources could give constraints on the CPR. However, only in 1990 was the polarization
at radio wavelengths of RGs and quasars used for the first astrophysical test of
CPR.¹²,ᵇ Ref. 12 used the fact that extended radio sources, in particular the
more strongly polarized ones, tend to have their plane of integrated radio polarization,
corrected for Faraday rotation, usually perpendicular and occasionally parallel
to the radio source axis,!® to put a limit of 6° at the 95% confidence level (CL)
on any rotation of the plane of polarization for the radiation coming from these
sources in the redshift interval 0.4 < z < 1.5.

Reanalyzing the same data, Nodland and Ralston® claimed to have found a 
rotation of the plane of polarization, independent of the Faraday one, and corre- 
lated with the angular positions and distances to the sources. Such rotation would 
be as much as 3 rad for the most distant sources. However, several authors have
independently and convincingly rejected this claim, both for problems with the sta- 
tistical methods,!*:?7>! and by showing that the claimed rotation is not observed 
for the optical/ultraviolet (UV) polarization of two RGs (see below) and for the 
radio polarization of several newly observed RGs and quasars.” 

In fact, the analysis of Leahy⁴⁸ is also important because it introduces a significant
improvement to the radio polarization method for the CPR test. The problem
with this method is the difficulty in estimating the direction of the polarization at 
the emission. Since the radio emission in RGs and quasars is due to synchrotron 
radiation, the alignment of its polarization with the radio axis implies an alignment 
of the magnetic field, which is not obvious per se. In fact, theory and magne- 
tohydrodynamics simulations foresee that the projected magnetic field should be 
perpendicular to strong gradients in the total radio intensity.”°° For example, for 
a jet of relativistic electrons the magnetic field should be perpendicular to the local 
jet direction at the edges of the jet and parallel to it where the intensity changes 
along the jet axis.‘ On the other hand, such alignments are much less clear for the 
integrated polarization, because of bends in the jets and because intensity gradients 
can have any direction in the radio lobes, which emit a large fraction of the polar- 
ized radiation in many sources. In fact, it is well-known that the peaks at 90° and 
0° in the distribution of the angle between the direction of the radio polarization 
and that of the radio axis are very broad and the alignments hold only statistically, 
but not necessarily for individual sources (see e.g. Fig. 1 of Ref. 12). More stringent 
tests can be carried out using high angular resolution data on radio polarization 
and the local magnetic field’s alignment for individual sources,’ although to our
knowledge this method has been used only once⁴⁸ to put quantitative limits on


ᵇRef. 9 had earlier claimed a substantial anisotropy in the angle between the direction of the radio
axis and the direction of linear radio polarization in a sample of high-luminosity classical double 
radio sources, but used it to infer rotation of the universe, not to test for CPR. 

the polarization rotation. For example, Carroll,¹⁴ using the data on the ten RGs of
Leahy,⁴⁸ obtains an average constraint on any CPR angle of α = −0.6° ± 1.5° at the
mean redshift ⟨z⟩ = 0.78. However, the preprint by Leahy⁴⁸ remained unpublished
and does not explain convincingly how the angle between the direction of the local
intensity gradient and that of the polarization is derived. For example, for 3C9, the
source with the best accuracy, Leahy⁴⁸ refers to Ref. 47, which, however, does not give
any measurements of local gradients.


4. Constraints from the UV Polarization of RGs 


Another method to test for CPR has used the perpendicularity between the direction
of the elongated structure in the UVᶜ and the direction of linear UV polarization
in distant powerful RGs. The test was first performed by Refs. 16 and 24, who
found that any rotation of the plane of linear polarization for a dozen RGs at
0.5 < z < 2.63 is smaller than 10°.

Although this UV test has sometimes been confused with the one at radio 
wavelengths, probably because they both use RGs polarization, it is a completely 
different and independent test, which hinges on the well-established unification 
scheme for powerful radio-loud Active Galactic Nuclei (AGN).° This scheme foresees
that powerful radio sources do not emit isotropically, but their strong UV radiation 
is emitted in two opposite cones, because the bright nucleus is surrounded by an 
obscuring torus: if our line of sight is within the cones, we see a quasar, otherwise we
see an RG. Therefore, powerful RGs have a quasar in their nuclei, which can only be
seen as light scattered by the interstellar medium of the galaxy. Often, particularly 
in the UV, this scattered light dominates the extended radiation from RGs, which 
then appear elongated in the direction of the cones and strongly polarized in the 
perpendicular direction.?? The axis of the UV elongation must be perpendicular 
to the direction of linear polarization, because of the scattering mechanism which 
produces the polarization. Therefore, in this case it is possible to accurately predict 
the direction of polarization at the emission and compare it with the observed one. 
This method of measuring the polarization rotation can be applied to any single
distant RG that is strongly polarized in the UV, allowing independent
CPR tests in many different directions. Another advantage of this method is that
it does not require any correction for Faraday rotation, which is large at radio 
wavelengths, but negligible in the UV. 

In the case of well resolved sources, the method can be applied also to the 
polarization which is measured locally at any position in the elongated structures 
around RGs, and which has to be perpendicular to the vector joining the observed 
position with the nucleus. From the polarization map in the V-band (~3000 Å rest-frame)
of 3C 265, an RG at z = 0.811,⁷² the mean deviation of the 53 polarization


ᶜWhen a distant RG (z > 0.7) is observed at optical wavelengths (λ_obs ~ 5000 Å), these correspond
to the UV in the rest frame (λ_em < 3000 Å).

vectors plotted in the map from the perpendicular to a line joining each to the
nucleus is −1.4° ± 1.1°.⁷² However, more distant RGs are so faint that only the
integrated polarization can be measured, even with the largest current telescopes: 
strict perpendicularity is expected also in this case, if the extended emission is 
dominated by the scattered radiation, as is the case in the UV for the strongly 
polarized RGs.” 
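
The local version of the test on a resolved map can be sketched as follows: for each map position, the expected PA is perpendicular to the line joining that position to the nucleus, and the folded difference from the observed PA is the local rotation. This is a minimal Python sketch under stated assumptions: the function names, the input layout (east and north offsets from the nucleus) and the sample values are hypothetical and are not the 3C 265 data.

```python
import math


def fold(angle_deg):
    """Fold an orientation difference into [-90, 90) degrees."""
    return (angle_deg + 90.0) % 180.0 - 90.0


def perpendicularity_deviations(offsets_east_north, pa_obs_deg):
    """Deviation of each observed PA from the perpendicular to the
    line joining that map position to the nucleus.

    offsets_east_north: (east, north) offsets from the nucleus.
    pa_obs_deg: observed PAs, IAU convention (North through East).
    """
    deviations = []
    for (east, north), pa in zip(offsets_east_north, pa_obs_deg):
        radial_pa = math.degrees(math.atan2(east, north))  # PA of the radius vector
        expected_pa = radial_pa + 90.0  # scattering: perpendicular to the radius
        deviations.append(fold(pa - expected_pa))
    return deviations


def mean_and_standard_error(values):
    """Sample mean and standard error of the mean."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance / n)


# Hypothetical map positions and observed PAs (illustration only).
offsets = [(1.0, 0.0), (0.0, 1.0), (-1.0, -1.0), (2.0, 2.0)]
pas = [1.0, 88.0, 134.0, 137.0]
devs = perpendicularity_deviations(offsets, pas)
print(devs, mean_and_standard_error(devs))
```

The mean and its standard error over many vectors are the kind of summary quoted above for a single well resolved source; how the published uncertainty was actually derived is not stated in the text.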

Recently, the available data on all RGs with redshift larger than two and with
a measured degree of linear polarization larger than 5% in the UV (at ~1300 Å)
have been reexamined, and no rotation within a few degrees in the polarization
for any of these eight RGs has been found.²⁵ In addition, assuming that the CPR
angle should be the same in every direction, an average constraint on this rotation
⟨α⟩ = −0.8° ± 2.2° (1σ) at the mean redshift ⟨z⟩ = 2.80 has been obtained.²⁵ The
same data have been used by Ref. 39 to set a CPR constraint in the case of a nonuniform
polarization rotation, i.e. a rotation which is not the same in every direction: in this
case the variance of any rotation must be ⟨δα²⟩ < (3.7°)². The CPR test using the
UV polarization has advantages over the other tests at radio or CMB wavelengths,
if CPR effects grow with photon energy (the contrary of Faraday rotation), as in a
formalism where Lorentz invariance is violated but CPT is conserved.*?:44
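
Under the uniformity assumption, single-source rotations can be combined into one average constraint. The text does not say how the published average was formed; the sketch below assumes a plain inverse-variance weighted mean, and the input values are hypothetical, not the measurements for the eight RGs.

```python
import math


def weighted_mean(values, sigmas):
    """Inverse-variance weighted mean and its 1-sigma uncertainty."""
    weights = [1.0 / s ** 2 for s in sigmas]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    return mean, 1.0 / math.sqrt(total)


# Hypothetical per-source rotations (deg) and 1-sigma errors (deg).
alphas = [2.0, -4.0, 1.0, -3.0]
errors = [4.0, 4.0, 4.0, 4.0]
print(weighted_mean(alphas, errors))  # (-1.0, 2.0)
```

With equal errors the weighted mean reduces to the arithmetic mean, and the uncertainty shrinks as the square root of the number of sources.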


5. Constraints from the Polarization of the CMB Radiation 


A more recent method to test for the existence of CPR is the one that uses the 
CMB polarization, which is induced by the last Thomson scattering of decoupling 
photons at z ~ 1100, resulting in a correlation between temperature gradients 
and polarization.*9 CMB photons are strongly linearly polarized, since they result 
from scattering. However, the high uniformity of the CMB produces a very effective
averaging of the polarization in any direction. It is only at the CMB temperature
inhomogeneities that the polarization does not average out completely and residual
polarization perpendicular to the temperature gradients is expected. Therefore,
also for the CMB polarization it is possible to precisely predict the polarization 
direction at the emission and to test for any CPR. After the first detection of CMB 
polarization anisotropies by Degree Angular Scale Interferometer (DASI),*° there 
have been several CPR tests using the CMB E-mode polarization pattern. 

Unfortunately, the scientists working on the CMB polarization have adopted
for the polarization angle a convention which is opposite to the IAU one, used
for decades by all other astrophysicists and enforced by the IAU*®: for the CMB
polarimetrists, following a software package for data pixelization on a sphere,®° the
polarization angle increases clockwise, instead of counterclockwise, facing the
source. This produces an inversion of the U Stokes parameter, corresponding to
a change of PA sign. Obviously, these different conventions have to be taken into
account when comparing data obtained with the different methods used for CPR
searches. As mentioned in the introduction, all PAs in this paper are given in the
IAU convention. Independently of the adopted convention, a problem of CMB
polarimetry is the calibration of the PA, which is not easy at CMB frequencies.
Although different methods are used, like the a priori knowledge of the detector’s
orientation and the use of calibration sources both near the experiment on the
ground and on the sky, the current calibration accuracy is of the order of one
degree, producing a nonnegligible systematic error β on the measured PA. In order
to alleviate the PA calibration problem, Ref. 42 has suggested a self-calibration
technique consisting in minimizing the EB and TB power spectra with respect to the PA
offset. Unfortunately, such a calibration technique would eliminate not just the PA
calibration offset β, but also α − β, where α is the uniform CPR angle, if it exists.
Therefore, no independent information on the uniform CPR angle can be obtained
if this calibration technique is adopted, as with the Background Imaging of Cosmic
Extragalactic Polarization 2 (BICEP2)? experiment.
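
The degeneracy described above can be illustrated with the standard relations for a uniform rotation θ of the polarization in the absence of intrinsic EB correlation: the observed EB spectrum is ½ sin(4θ)(C_EE − C_BB), and the observed difference EE − BB is cos(4θ)(C_EE − C_BB). A minimal Python sketch, in which the function names, the band powers and the way the net rotation is formed from the CPR angle and the calibration offset are all assumptions made for illustration (the sign of each term depends on the PA convention):

```python
import math


def rotate_spectra(cl_ee, cl_bb, theta_deg):
    """Observed EE, BB and EB spectra after a uniform rotation theta.

    Assumes no intrinsic EB correlation; the overall sign of EB
    depends on the adopted PA convention.
    """
    two = math.radians(2.0 * theta_deg)
    c2, s2 = math.cos(two) ** 2, math.sin(two) ** 2
    ee_obs = [ee * c2 + bb * s2 for ee, bb in zip(cl_ee, cl_bb)]
    bb_obs = [ee * s2 + bb * c2 for ee, bb in zip(cl_ee, cl_bb)]
    eb_obs = [0.5 * (ee - bb) * math.sin(2.0 * two) for ee, bb in zip(cl_ee, cl_bb)]
    return ee_obs, bb_obs, eb_obs


def nulling_angle(ee_obs, bb_obs, eb_obs):
    """Rotation (deg) whose removal would null the observed EB spectrum."""
    num = 2.0 * sum(eb_obs)
    den = sum(ee - bb for ee, bb in zip(ee_obs, bb_obs))
    return math.degrees(math.atan2(num, den)) / 4.0


# Hypothetical band powers (arbitrary units), with EE much larger than BB.
cl_ee = [1.0, 2.0, 1.5]
cl_bb = [0.05, 0.10, 0.08]

# Two different splits of the same net rotation of 1 deg.
for cpr, offset in [(1.0, 0.0), (0.0, 1.0)]:
    net = cpr + offset  # assumed sign convention for the combination
    print(cpr, offset, nulling_angle(*rotate_spectra(cl_ee, cl_bb, net)))
```

Both splits return the same nulling angle, which is why a calibration obtained by EB minimization carries no independent information on a uniform CPR angle.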

In the following, we summarize the most recent and accurate CPR measurements
obtained using the CMB polarization (see Table 1 and Fig. 1). The BOOMERanG
collaboration, revisiting the limit set from their 2003 flight, found a CPR angle
α = 4.3° ± 4.1° (68% CL), assuming uniformity over the whole sky.⁶⁴ The QUEST
at DASI (QUaD) collaboration found α = −0.64° ± 0.50° (stat.) ± 0.50° (syst.)
(68% CL),¹¹ while using three years of BICEP1 data one gets α = 2.77° ± 0.86°
(stat.) ± 1.3° (syst.) (68% CL).⁴⁰ Combining nine years of Wilkinson Microwave
Anisotropy Probe (WMAP) data and assuming uniformity, a limit to the CPR angle
α = 0.36° ± 1.24° (stat.) ± 1.5° (syst.) (68% CL) has been set, or −3.53° < α < 4.25°
(95% CL), adding in quadrature statistical and systematic errors.³³ The POLARBEAR
collaboration³ reports a difference of 1.08° between the instrument polarization
angle obtained at 148 GHz by minimizing the EB spectrum and that obtained
from their data on the Crab Nebula using the PA measurement at 90 GHz of Ref. 6.
This corresponds to a measurement of CPR, performed using the effect of a rotation
on the EB spectrum and using the Crab Nebula for the PA calibration, and giving
a CPR angle α = 1.08° ± 0.2° (stat.) ± 0.5° (syst.), assuming that the Crab Nebula
polarization angle does not change between 90 and 148 GHz. A consistency check
with the value of the Cen A polarization angle measured by POLARBEAR confirms
this result. Recently the ACTPol (Atacama Cosmology Telescope Polarimeter)
team⁵⁶ have used their first three months of observations to measure the CMB
polarization over four sky regions near the celestial equator. They do not give an
explicit value for the CPR, also because they have used the EB and TBᵈ power
minimization technique of Ref. 42. However, it is possible to derive a value of the
CPR from their data, since they have measured a PA of 150.9° ± 0.6° for the Crab Nebula
(Tau A, a polarization standard source), using the EB and TB nulling procedure
(Hasselfield, private communication). The most precise fiducial measurement at
CMB frequencies of the Crab Nebula polarization angle is a PA of 149.9° ± 0.2° at
90 GHz.⁶ If we assume that the Crab Nebula polarization PA would not change


ᵈEB and TB are the cross-correlation power spectra between E- and B-modes and between temperature
and B-mode.


Table 1. Measurements of CPR with different methods (in chronological order).


Method              CPR angle ± stat. (± syst.)   Frequency or λ       Distance        Direction                                    Reference
RG radio pol.       |α| < 6°                      5 GHz                0.4 < z < 1.5   All-sky (uniformity ass.)                    12
RG UV pol.          |α| < 10°                     ~3000 Å rest-frame   0.5 < z < 2.63  All-sky (uniformity ass.)                    16,17
RG UV pol.          α = −1.4° ± 1.1°              ~3000 Å rest-frame   z = 0.811       RA: 176.4°, Dec: 31.6°                       72
RG radio pol.       α = −0.6° ± 1.5°              3.6 cm               ⟨z⟩ = 0.78      All-sky (uniformity ass.)                    14,48
CMB pol. BOOMERanG  α = 4.3° ± 4.1°               145 GHz              z ~ 1100        RA ~ 82°, Dec ~ 45°                          64
CMB pol. QUaD       α = −0.64° ± 0.50° ± 0.50°    100-150 GHz          z ~ 1100        RA ~ 82°, Dec ~ 50°                          11
RG UV pol.          α = −0.8° ± 2.2°              ~1300 Å rest-frame   ⟨z⟩ = 2.80      All-sky (uniformity ass.)                    25
RG UV pol.          ⟨δα²⟩ < (3.7°)²               ~1300 Å rest-frame   ⟨z⟩ = 2.80      All-sky (stoch. var.)                        25,39
CMB pol. WMAP9      α = 0.36° ± 1.24° ± 1.5°      23-94 GHz            z ~ 1100        All-sky (uniformity ass.)                    33
CMB pol. BICEP1     α = 2.77° ± 0.86° ± 1.3°      100-150 GHz          z ~ 1100        −50° < RA < 50°, −70° < Dec < −45°           40
CMB pol. POLARBEAR  α = 1.08° ± 0.2° ± 0.5°       148 GHz              z ~ 1100        RA ~ 70°, 178°, 345°; Dec ~ −45°, 0°, −33°   3
CMB pol. ACTPol     α = 1.0° ± 0.63°**            146 GHz              z ~ 1100        RA ~ 35°, 150°, 175°, 355°, Dec ~ 50°        54,56
CMB pol. B-mode     ⟨δα²⟩ < (1.36°)²              95-150 GHz           z ~ 1100        Various sky regions                          54
CMB pol. Planck     α = 0.35° ± 0.05° ± 0.28°     30-353 GHz           z ~ 1100        All-sky (uniformity ass.)                    79


Note: **A systematic error should be added, equal to the unknown difference of the Crab Nebula polarization PA between 146 GHz and 90 GHz.


Fig. 1. CPR angle measurements by the various experiments, displayed in chronological order.
Black error bars are for the statistical error, while red ones are for the systematic one, if present.
A systematic error should be added to the ACTPol measurement, equal to the unknown difference
of the Crab Nebula PA between 146 GHz and 90 GHz.⁵⁴


between 90 GHz and 146 GHz (see e.g. the discussion in Sec. 6 of Ref. 6), then
the average CPR angle over the ACTPol equatorial regions would be the difference
between the above values, α = 1.0° ± 0.63° (see Ref. 54); however, the above
assumption leaves room for some systematic error. We could instead use the data of
Ref. 56 in a different way: since the PA offset angle which they obtain from the EB
minimization technique is −0.2° ± 0.5°, i.e. consistent with zero, Ref. 56 suggests
that their optical modeling procedure should be free of systematic errors at the 0.5°
level or better. If this were true, then α = 0.22° ± 0.32° ± 0.5°.⁵⁴ In summary, for
the ACTPol result we prefer the assumption on the constancy of the Crab Nebula
polarization angle between 90 GHz and 146 GHz, also because this can be tested a
posteriori and any necessary correction applied. Recently the results on CPR from the
Planck satellite have finally been published, giving α = 0.35° ± 0.05° ± 0.28° with
the stacking analysis.“° Thanks to the very good quality of the Planck data, they 
achieve, as expected, a very small statistical uncertainty, considerably lower than 
any previous measurement. However their accuracy is limited by the uncertainty in 
the calibration of the position angle: even using the best calibrators, their system- 
atic uncertainty is more than 5 times larger than the statistical one. In fact, most 
likely their measurement of the CPR angle (see Table 1) is actually a measurement 
of the Planck polarization angle offset. 
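
Two of the numbers quoted in this section can be reproduced with simple error propagation. The sketch below is a Python illustration using only values given in the text; the helper name `quadrature` is an assumption, and reading the quoted 95% CL interval as ±2σ of the combined error is an inference from the numbers rather than a statement from the source.

```python
import math


def quadrature(*errors):
    """Combine independent 1-sigma errors in quadrature."""
    return math.sqrt(sum(e * e for e in errors))


# WMAP9: alpha = 0.36 +/- 1.24 (stat.) +/- 1.5 (syst.) deg.
alpha = 0.36
sigma = quadrature(1.24, 1.5)
low, high = alpha - 2.0 * sigma, alpha + 2.0 * sigma
print(round(low, 2), round(high, 2))  # -3.53 4.25

# ACTPol: Crab Nebula PA from EB/TB nulling minus the 90 GHz reference PA.
pa_actpol, err_actpol = 150.9, 0.6
pa_ref, err_ref = 149.9, 0.2
alpha_actpol = pa_actpol - pa_ref
err_alpha = quadrature(err_actpol, err_ref)
print(round(alpha_actpol, 2), round(err_alpha, 2))  # 1.0 0.63
```

The ACTPol uncertainty obtained this way is purely statistical; as noted in Table 1, any change of the Crab Nebula PA between 90 GHz and 146 GHz would add a systematic term.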

In summary, although some have claimed to have detected a rotation,4°-”® the
CMB polarization data appear well consistent with a null CPR. In principle the 
CMB polarization pattern can be used to test CPR in specific directions. However, 
because of the extremely small anisotropies in the CMB temperature and polar- 
ization, these tests have so far used averages over relatively large regions of sky, 
assuming uniformity. 

Recently, Ref. 26 has suggested the possibility of setting constraints on the
CPR also using measurements of the B-mode polarization of the CMB, because of
the coupling from E-mode to B-mode polarization that any such rotation would
produce. This possibility is presently limited by the relatively large systematic
errors on the polarization angle still affecting current data. The result is that from
the SPTpol (South Pole Telescope polarimeter), POLARBEAR and BICEP2 B-mode
polarization data it is only possible to set constraints on the fluctuations
⟨δα²⟩ < (1.56°)² of the CPR, not on its mean value. Ref. 54 has similarly obtained
an upper limit on the CPR fluctuations ⟨δα²⟩ < (1.68°)² from the ACTPol B-mode
data of Ref. 56. By considering also SPTpol B-mode polarization data, Ref. 79
has recently improved this upper limit to ⟨δα²⟩ < (0.97°)². The second-to-last row
of Table 1 reports the combined constraint on the CPR fluctuations obtained from
all the B-mode data mentioned above.


6. Other Constraints 


Observations of nearby polarized galactic objects could contribute to the CPR test, 
in particular, for those cases where polarization measurements can be made with 
high accuracy and at very high frequencies (useful if CPR effects grow with photon 
energy). Pulsars and supernova remnants emit polarized radiation in a very broad 
frequency range. For example, hard X-ray polarization observations of the Crab 
Nebula?” have been used to set a limit to the CPR angle α = −1° ± 11°.°? However,
this limit is not particularly stringent, both because of the low accuracy of the 
X-ray polarization measurement and because of the limited distance to the source. 
In the future, more precise X-ray polarization experiments such as POLARIX,! could
greatly improve the situation.

Gamma-ray bursts (GRB) are very distant sources which emit polarized radiation
both in the optical afterglow??? and in the prompt gamma-ray emission.³¹,³⁸
Nevertheless, they cannot be used for CPR searches, since the orientation of the
polarization at the emission is unknown. However, they can be used to test for
birefringence effects, i.e. an energy-dependent rotation of the polarization angle,
such as those produced by Lorentz invariance violation,°° since the detection of
linear polarization in a gamma-ray band excludes a significant rotation of the
polarization within that energy band. In this way Ref. 31 was able to put an upper limit
on the dimensionless parameterᵉ of this birefringence effect of ξ < 1 × 10⁻¹⁶ from


ᵉξ = (n₀)³, where n₀ is the time component of the Myers–Pospelov four-vector n_α, in a reference
frame where n_α = (n₀, 0, 0, 0).375°

the gamma-ray polarization of a GRB at z = 2.74. Using the same data for testing
the Lorentz symmetry and the equivalence principle, Refs. 61 and 62 provide a
birefringence constraint of about 10⁻³⁸.

For an issue related to CPR, Ref. 34 provides evidence that the directions of 
linear polarization at optical wavelengths for a sample of 355 quasars (0 < z < 2.4) 
are nonuniformly distributed, being systematically different near the North and 
South Galactic Poles, particularly in some redshift ranges. Such behavior could not 
be caused by uniform CPR, since a rotation of randomly distributed directions of 
polarization could produce the observed alignments only with a very contrived dis- 
tribution of rotations as a function of distance and position in the sky. Moreover, 
the claim by Ref. 34 has not been confirmed by the radio polarization directions on 
a much larger sample of 4290 flat-spectrum radio sources,?” and Ref. 35 recently 
suggested that the alignments could be due to an alignment of the quasars’ spin axes
with the structures to which they belong. The possibility that the quasars’ polarization
alignments could be due to the mixing of photons with axion-like particles is
excluded by the absence of circular polarization.®° 

The rotation of the plane of linear polarization can be seen as different propagation
speeds for right and left circularly polarized photons (Δc/c). The sharpness of
the pulses of pulsars in all Stokes parameters can be used to set limits corresponding
to Δc/c < 10⁻¹⁷. Similarly, the very short duration of GRBs gives limits of the
order of Δc/c < 10⁻²¹. However, the lack of linear polarization rotation discussed
in the previous sections can be used to set much tighter limits (Δc/c < 10⁻³²).?°

In a complementary way to the astrophysical tests described in the previous
sections, laboratory experiments can also be used to search for CPR. These are
outside the scope of this paper and have not obtained significant constraints. For
example, the PVLAS (Polarizzazione del Vuoto con LASer) collaboration reported
a polarization rotation in the presence of a transverse magnetic field,” but later
retracted this claim, attributing the rotation to an instrumental artifact.” The null
result is consistent with the measurement of Ref. 15.


7. Discussion 


Table 1 and Fig. 1 summarize the most important limits set on the CPR angle 
with the various methods examined in the previous sections. Only the best and 
most recent results obtained with each method are listed. For uniformity, all the 
results for the CPR angle are listed at the 68% CL (1σ), except for the first one,
which is at the 95% CL, as given in the original Ref. 12. In general, all the results
are consistent with each other and with a null CPR. Even the CMB measurement
by BICEP1, which apparently shows a nonzero rotation at the 2σ level, cannot
be taken as a firm CPR detection, since it has not been confirmed by other more 
accurate measurements. 

In practice, all CPR test methods have so far reached an accuracy of the order
of 1° and 3σ upper limits to any rotation of a few degrees. It has, however, been
useful to use different methods since they are complementary in many ways. They
cover different wavelength ranges and, although most CPR effects are wavelength
independent, the methods at shorter wavelengths have an advantage if CPR effects
grow with photon energy. They also reach different distances, and the CMB method
is unbeatable in this respect. However, the relative difference in light travel time
between z = 3 and z = 1100 is only 16%. The radio polarization method, when
it uses the integrated polarization, has the disadvantage of not relying on a firm
prediction of the polarization orientation at the source, which the other methods
have. In addition, the radio method requires correction for Faraday rotation. All
methods can potentially test for a rotation which is not uniform in all directions,
although this possibility has not yet been exploited by the CMB method, which also
cannot show how any rotation would depend on the distance. Reference 28
has recently examined the dependence of CPR on the wavelength and on the
distance of the source, and found none, which is not surprising for a null (so far)
CPR: in practice, it cannot improve the limit already set on the birefringence
parameter ξ in Ref. 31 (see Sec. 6).
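
The statement about light travel times can be checked with the closed-form cosmic age of a flat universe containing only matter and a cosmological constant. This is a rough Python sketch: the matter density Ω_m = 0.3 is an assumed value not given in the text, radiation is neglected, and the Hubble constant cancels in the ratio.

```python
import math


def age_at_redshift(z, omega_m=0.3):
    """Cosmic age at redshift z in units of 1/H0 for a flat
    matter + Lambda universe (radiation neglected)."""
    omega_l = 1.0 - omega_m
    x = math.sqrt(omega_l / omega_m) * (1.0 + z) ** -1.5
    return 2.0 / (3.0 * math.sqrt(omega_l)) * math.asinh(x)


def light_travel_time(z, omega_m=0.3):
    """Light travel (lookback) time to redshift z in units of 1/H0."""
    return age_at_redshift(0.0, omega_m) - age_at_redshift(z, omega_m)


t3 = light_travel_time(3.0)
t1100 = light_travel_time(1100.0)
relative_difference = (t1100 - t3) / t1100
print(round(100.0 * relative_difference, 1), "%")
```

For this assumed cosmology the relative difference comes out close to 16%, in line with the figure quoted above, because the universe at z = 3 was already only about one sixth of its present age.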


8. Outlook 


In the future, improvements can be expected for all methods, e.g. by better targeted 
high resolution radio polarization measurements of RGs and quasars, by more accu- 
rate UV polarization measurements of RGs with the coming generation of giant 
optical telescopes,®:?1:" and by future CMB polarimeters such as BICEP3* and 
COrE+.®° In any case, since at the moment the limiting factor for improving the
constraints on the CPR angle with the CMB is the systematic uncertainty in the
calibration of the polarization angle, it will be necessary to reduce this uncertainty, which at
the moment is at best 0.3° for CMB polarization experiments. The best prospects
for achieving this improvement are likely to be more precise measurements of the
polarization angle of celestial sources at CMB frequencies, e.g. with the Australia
Telescope Compact Array®? and with the Atacama Large Millimeter/Submillimeter
Array (ALMA),® and a calibration source on a satellite.4!


Recent progress in the domain of time and frequency standards has required some important
improvements of existing time transfer links. Several time transfer by laser link
(T2L2) projects have been carried out since 1972 with numerous scientific or technological
objectives. There are two projects currently in operation: T2L2 and Lunar
Reconnaissance Orbiter (LRO). The former is a dedicated two-way time transfer experiment
on board the satellite Jason-2, allowing for the synchronization of remote
clocks with an uncertainty of 100 ps, and the latter is a one-way link devoted to ranging
a spacecraft orbiting the Moon. There is also the Laser Time Transfer (LTT)
project, operated until 2012 and designed within the framework of the Chinese navigation
constellation. In the context of future space missions for fundamental physics, solar system
science or navigation, laser links are of prime importance and many missions based on
that technology have been proposed for these purposes.


Keywords: Clock; time transfer; laser link; laser ranging; event timer. 


1. Introduction 


Instruments allowing for the comparison of distant clocks and for the distribution
of time scales have important applications in metrology, navigation and
fundamental physics. Recent progress in the domain of time and frequency standards
has required some important improvements of existing time transfer links.
The most suitable technique available today to realize the best time transfer in free
space is based on the propagation of laser pulses. Such a time transfer can be used
to compare clocks in space, or to compare clocks between ground
and space or between several users on the ground. These optical time transfers rely on
the existing laser ranging network originally developed to measure distances between
satellites (or the Moon) and the ground. The principle of that technique is based
on the two-way time-of-flight measurement of picosecond laser pulses from ground
stations to (passive) retro-reflectors on the satellite. The distance between the station
and the satellite is computed for every laser pulse emitted by the ground station
and reflected by the satellite from the time interval between the start and return
epochs (Fig. 1). The accurate distance between the satellite and the reference point
of the ground station is obtained by a permanent optical calibration on the ground and


Fig. 1. Laser ranging principle. The laser pulses are emitted and received by the ground laser
station. Some retro-reflectors on the satellite reflect a fraction of the incident laser pulse. Usual
parameters: Laser rate: 10 Hz (up to 2 kHz); Wavelength: 532 nm; FWHM: 50 ps. (For color version,
see page I-CP4.)


knowledge of the geometry of the satellite and of its mass distribution. The location 
of the laser station on Earth and the orbitography of the satellite are computed in
the International Terrestrial Reference Frame (ITRF).! By using a dedicated active 
space instrument able to record arrival epochs of laser pulses on the satellite, these 
satellite laser ranging technologies can be extended to realize a ground to space 
time transfer. 
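
The ranging principle described above reduces, for a single laser shot, to half the round-trip time multiplied by the speed of light. A minimal Python sketch, assuming vacuum propagation and ignoring the atmospheric delay, the satellite motion during the flight and the calibration terms mentioned in the text; the function name and the sample epochs are hypothetical:

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def one_way_range(t_start_s, t_return_s):
    """Station-to-reflector distance (m) from the start and return
    epochs of one laser pulse, both read on the station clock."""
    return 0.5 * C * (t_return_s - t_start_s)


# Hypothetical shot: echo received 8.9 ms after emission.
print(one_way_range(0.0, 8.9e-3))

# Timing resolution translates directly into range resolution:
# 1 ps of round-trip time corresponds to about 0.15 mm in range.
print(one_way_range(0.0, 1e-12) * 1e3, "mm")
```

Since both epochs are read on the same station clock, only the stability of that clock over the round trip matters for ranging, not its synchronization to any other clock.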

Two kinds of time transfer can be envisioned. The first, hereafter called two-way,
is based on a transfer with both an uplink and a downlink, allowing the measurement,
by the process itself, of the propagation delay. The second, called
one-way, is based on a single link with a propagation delay deduced and computed
from the distance between the clocks. The two-way transfer can be done either through
a simple reflection of the laser beam on the corner cube together with the active
space instrument to time-tag the laser pulses, or through active equipment using
a synchronous or asynchronous transponder.? The passive two-way is well suited
for high-precision time transfer over distances of a few tens of thousands of kilometers,
while the one-way is more suited for time transfer at very large scale (Solar System).
The Earth–Moon distance is typically the maximum distance which can be
envisioned in a passive two-way link. Figure 2 is an example of a two-way link in
Earth orbit allowing for time transfer between a ground station and a space vehicle.

Based on the scenario depicted in Fig. 2, ground to ground time transfer can
be performed with a single space instrument and several ground laser stations. It
can be realized in either a common or a noncommon view mode. In the first case,
the satellite is seen by the stations during a common period, while in the second
the satellite is observed alternately (Fig. 3). Depending on the space clock used
(quartz, H-Maser, ...), the performance of the ground to ground link in the
noncommon view mode can be significantly affected by the noise introduced by the
space clock during the nonobserved period T.
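
The ground to ground scheme can be sketched as the difference of two ground to space comparisons made against the same space clock. The snippet below is an illustrative assumption, not the processing used by any mission: it models the space clock during the nonobserved period T by a constant fractional frequency error, whereas real clock noise is stochastic.

```python
def ground_to_ground_offset(offset_a_space, offset_b_space):
    """Offset of ground clock A relative to ground clock B (s).

    Each argument is the offset of a ground clock relative to the
    shared space clock; the space clock cancels in the difference.
    """
    return offset_a_space - offset_b_space


def dead_time_error(frac_freq_error, dead_time):
    """Time error (s) accumulated by the space clock over the
    nonobserved period, for a constant fractional frequency error."""
    return frac_freq_error * dead_time


# Hypothetical numbers: a 1e-12 frequency error held for T = 1000 s
# adds about 1 ns to a noncommon view comparison.
offset_ab = ground_to_ground_offset(5.0e-9, 3.0e-9)
error = dead_time_error(1e-12, 1000.0)
print(offset_ab, error)
```

In common view mode the dead time is zero, so the space clock contributes only through its short-term noise during the pass.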


Fig. 2. Ground to space time transfer based on a two-way link in Earth orbit with a laser station 
on ground and an active space equipment linked with a retro-reflector. (For color version, see page 
I-CP4.) 


Fig. 3. Ground to ground time transfer in a common view mode (left) and in a noncommon view 
mode (right). (For color version, see page I-CP4.) 


Several time transfer by laser link (T2L2) projects have been tested since 1972 
with numerous scientific or technological objectives. 

The first two-way optical time transfer experiment was proposed to the European
Space Agency (ESA) in 1972 with the Laser Synchronization from Stationary Orbit
(LASSO) project. The mission was a significant success, achieving intercontinental
ground to ground time transfers with time stabilities in the range of
100 ps over several thousand seconds. The instrument was embedded on a geostationary
satellite (Meteosat P2) launched in 1988. The first common time transfer session
between two remote laboratories (Observatoire de la Côte d’Azur (OCA), France,
and TUG, Austria) was carried out in 1989. The primary objective of LASSO was to
demonstrate the feasibility of this novel time transfer technique with an uncertainty
better than 1 ns.

The next operational laser time transfer experiment was Laser Time Transfer (LTT).
The project was developed by the Shanghai Astronomical Observatory (SHAO) in
the frame of the Chinese global navigation system Compass.⁴ A first LTT instrument
was launched in spring 2007 onboard the Compass M1 satellite,⁵,⁶ and two others
in summer 2010 and spring 2011. The equipment had a very simple and
rugged design using a single photon detector, a band pass filter and no other optical
component.⁷ The detection concept minimized the intensity-dependent
delay of the detection whatever the energy emitted (single photodetection concept).
The first ground to space time transfer based on LTT was done from the
Changchun satellite laser ranging station (China).⁸ The time scale synchronization
was performed with a data spread of 260 ps and a measurement of the frequency
differences between the ground and the space segments with an uncertainty of
3x10"? 

The first active two-way laser ranging experiment at planetary distance¹⁰ was
carried out in spring 2005 with the MESSENGER spacecraft by using the embedded
laser altimeter MLA.¹¹ The experiment was based on two asynchronous one-way
transponders, one operated from the ground and one from the spacecraft. It
provided range and time transfer between the spacecraft and the
Goddard Geophysical and Astronomical Observatory laser station. The measurements
were made at a distance of 24 million km. A second planetary link was established
at a distance of 80 million km with the altimeter of the Mars Global Surveyor (MGS)
spacecraft. This second link was obtained in a one-way uplink configuration.

The third laser link experiment conducted beyond Earth orbit was performed
with the Lunar Reconnaissance Orbiter (LRO).¹²,¹³ The objective was
to obtain routine one-way laser ranging with the Lunar Orbiter Laser Altimeter
(LOLA) to demonstrate spacecraft orbit determination. The experiment
operated successfully from summer 2009. Data acquisition from the
laser ranging network was stopped on Oct 1, 2014 but scientific analysis is still
ongoing.

In 1994, OCA proposed to build a new generation of optical time transfer called
T2L2.¹⁴,¹⁵ As compared to LASSO, the objective was to improve both time
stability and accuracy by at least two orders of magnitude, and to enlarge
the number of participating laser stations. After several proposals for the
satellites GIOVE (Galileo program), Myriade and the MIR space station,¹⁴ T2L2 was
accepted in 2000 in the frame of the Atomic Clock Ensemble in Space (ACES)
program¹⁶,¹⁷ but unfortunately taken off the mission in 2001 because of problems
related to the maximum allowed power and the mass budget. T2L2 was finally
accepted in 2005 as a passenger instrument on Jason-2, an
altimetry satellite designed to study the internal structure and dynamics of ocean
currents.¹⁸⁻²⁰ T2L2 was launched in June 2008 and has been running without any
significant interruption since that date. It was placed by a Delta launcher at an
altitude of 1336 km and an inclination of 66°. The T2L2 project is currently supported
by both OCA-GeoAzur and the French space agency CNES.

Section 2 is an overview of the scientific objectives associated with clock
comparisons by laser. Sections 3 and 4 give a global description of the two projects
currently in operation (T2L2 and LRO) and Sec. 5 is a nonexhaustive description of
future projects involving a laser link.

2. Scientific Objectives


As compared to classical microwave techniques, laser links have many major advan- 
tages for both Earth orbiting and Solar System missions. The most significant ben- 
efits are: 


— A very high frequency of the carrier (optical) allowing for high bandwidth 
modulation. 

— A good correction of the refraction delay induced by the atmosphere. Ionosphere 
uncertainty is negligible and the tropospheric correction can be determined 
through some atmospheric measurements (pressure, temperature, humidity). 

— Clear and well-defined reference points for both space and ground segments. 
A microwave antenna of 70m used for a mission in the Solar System can be 
replaced by an optical telescope having an aperture in the range of 1m. 

— A very well-focused beam allowing the use of some compact collectors (relevant 
for the space segment). 


Laser links in space allow for a large number of tests in many fields such as time and 
frequency metrology, navigation, fundamental physics and Solar System physics. It 
is clear that the exact science doable with a given mission depends on the precise 
experimental setup of the considered project (atomic clock, accelerometer, inter- 
ferometer, or trajectory measurement system in the Solar System), and we cannot 
give an exhaustive list of what we could measure. This section is a brief overview 
of the scientific objectives intentionally limited to laser links in space. 


2.1. Time and frequency metrology 


Time and frequency metrology has developed continuously for more than 60
years with regular improvement of both time accuracy and time stability. It has
benefited from scientific advances in several domains such as laser cooling of atoms,
optical clocks and frequency comparison with femtosecond laser combs.²¹⁻²³

For instance, the fractional accuracy of cesium microwave clocks has
improved by a factor of ten every ten years since 1950. Today cold atom clocks
routinely obtain fractional accuracies in the range of a few $10^{-16}$ and instabilities
better than $1.5 \times 10^{-16}$ over more than ten days.²⁴

Progress made in the domain of optical clocks is also very impressive, with an
improvement of two orders of magnitude every ten years from the 1980s until now.
It has been encouraged by recent advances in atom manipulation and in
optical combs, which have established a connection between the microwave
and optical domains. Today, more than ten laboratories worldwide are working on such
optical clocks. Some of them have a fractional frequency uncertainty of
only a few $10^{-18}$ (Ref. 25) (Fig. 4). The oscillation frequencies of these optical
standards are several hundred terahertz (from IR to UV). Up to now, there is
no electronic equipment fast enough to directly use these 100 terahertz signals. The
usual solution for establishing the connection between this optical domain and the

Fig. 4. Evolution of the frequency uncertainty of microwave and optical clocks.

Fig. 5. Time stabilities (square root of the time variance) of some typical clocks. PHARAO will be
the first cold atom space clock on ISS in 2017. DORIS USO is an ultrastable quartz oscillator used
for both the T2L2 space instrument and the DORIS System. The T4S H-Maser is a commercial
clock from T4Science. (For color version, see page I-CP4.)


classical gigahertz domain (electrical signals propagated in some coaxial cables) 
lies in the use of optical frequency combs generated from a high repetition rate 
femtosecond laser.?? 

Figure 5 shows the time stability (TDev) of some typical clocks using different
technologies. Quartz oscillators are only stable over short time intervals. They are
often disciplined in the long term by other sources such as atomic transitions. They
can also be used in applications where short-term stability is required for the
detection of rapid changes (Doppler detection). H-Masers are extremely stable over
mid-term time intervals (up to 10,000 s). They can be used for instance in
microwave links in the Solar System or for the localization of extragalactic sources
(VLBI). Optical clocks are today extremely stable over short- and mid-term time
intervals. The very high frequency of the carrier is a major advantage for ultra high
time stabilities.

The development of space clocks is also a major issue for the field of time
and frequency metrology. The uncertainty of the gravitational potential at the Earth's
surface limits progress beyond a fractional frequency uncertainty of $10^{-17}$.
Access to the space environment decreases that noise and will become
of prime importance in the near future.

Furthermore, all these improvements call for time and frequency transfer
methods well suited to the performances of these clocks. Common View GPS (GPS
CV) and Two-Way Satellite Time and Frequency Transfer (TWSTFT) are the most
common techniques currently used to compare clocks and to distribute
time and frequency references. GPS CV used in the P3 ionosphere-free
linear combination mode yields an expanded uncertainty estimated
between 3 ns and 7 ns. This result includes a coverage factor k = 2 applied
as a multiplier of the standard uncertainty.²⁶,²⁷ The future
European Galileo System will obtain the same kind of performance.

Two-way laser time transfer improves both the accuracy and the
time stability by more than one order of magnitude as compared to the classical
microwave techniques. Measurements carried out on the T2L2 project for a ground
to space link showed a time stability $\sigma_x$ (square root of the time variance) given by


$\sigma_x^2(\tau) = (65 \times 10^{-12} \times \tau^{-1/2})^2 + (2 \times 10^{-14} \times \tau^{+1/2})^2, \quad \tau_0 = 1\,\mathrm{s},$  (1)


where $\tau_0$ is the time interval between consecutive laser pulses. It is the sum of a
white phase noise ($\sigma_x \propto \tau^{-1/2}$) mainly induced by the repeatability error of the optical
detectors together with a white frequency noise ($\sigma_x \propto \tau^{1/2}$). The typical expanded
uncertainty between the time scales of two distinct clocks A and B, $u_E(\Delta_{AB})$, is (coverage
factor k = 2)²⁸:


$u_E(\Delta_{AB}) < 100\,\mathrm{ps}.$  (2)


With these performances, laser time transfer is well suited to validate
other time transfer techniques such as GPS, TWSTFT and also future technologies
currently under study (optical fiber, MWL (Micro Wave Link) instrument on
ACES).
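
The two-term stability model of Eq. (1) can be evaluated numerically with a short sketch. The coefficients below (65 ps for the white phase noise and 2 x 10^-14 for the white frequency noise, with tau expressed in units of tau_0 = 1 s) are assumptions taken from our reading of Eq. (1); the function and parameter names are illustrative.

```python
import math


def sigma_x(tau, white_phase=65e-12, white_freq=2e-14, tau0=1.0):
    """Time stability (s) as the quadratic sum of a white phase noise
    term (tau^-1/2) and a white frequency noise term (tau^+1/2).

    Default coefficients are assumed values read from Eq. (1).
    """
    x = tau / tau0
    phase_term = white_phase * x ** -0.5
    freq_term = white_freq * x ** 0.5
    return math.hypot(phase_term, freq_term)


# The two contributions are equal where white_phase / white_freq = tau,
# i.e. near 3250 s with the assumed coefficients.
for tau in (1.0, 100.0, 1000.0, 3250.0, 10000.0):
    print(f"tau = {tau:8.0f} s  sigma_x = {sigma_x(tau) * 1e12:6.2f} ps")
```

With these assumed coefficients, averaging improves the stability up to a few thousand seconds, after which the white frequency noise term dominates.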

In the time and frequency metrology domain, one relevant objective is to participate
in the improvement of time scales. The most crucial of these is International
Atomic Time (TAI), built from an ensemble of ultrastable clocks located all over
the world and synchronized to each other with microwave time transfer techniques.
A two-way laser link is a completely independent technique, well
suited to calibrate and validate these microwave links.

Ground to space laser time transfer is also essential to validate clocks in space.
This is the case for ACES, where European Laser Timing (ELT) will provide
an independent comparison of both onboard clocks (H-Maser and PHARAO), and for
T2L2 on Jason-2 equipped with the DORIS quartz oscillator.²⁹,³⁰ T2L2 is able to
measure the performance of the DORIS system quartz oscillator over an integration
time of roughly 20 s. It gives the opportunity to detect possible disturbances of the
oscillator caused by radiation (South Atlantic Anomaly). Such effects
have already been clearly highlighted by T2L2. In the frame of the Jason-2 mission, the
perturbations are not large enough to justify a correction model for the DORIS
navigation, but these effects are of prime importance for quartz technologies in
space. Laser time transfer is moreover used to evaluate the performance of the
atomic clocks used for GNSS navigation. The LTT project is currently being developed in
the frame of Compass for this reason and several other proposals (T2L2 and LTT)
have been made in the frame of the Galileo program.⁸ Other new developments
such as OPTI are currently under way for GNSS purposes in general.³¹


2.2. Fundamental physics 


An accurate and stable time transfer between remote
clocks is also of interest for fundamental physics. These laser
links should contribute to several distinct fields such as the search for a possible
anisotropy of the speed of light,³²,³³ the measurement of the gravitational redshift,³⁴
or the measurement of the Eddington parameter $\gamma$.³⁵

Laser time transfer on a satellite implies some optical propagation in different 
directions during the satellite pass. The detection of a possible anisotropy of the 
speed of light can be performed from the comparison between ground and space 
clocks as a function of the geometry used during the time transfer with the satellite 
(Fig. 6). 

Two scenarios can be envisioned. The first is based on a single link built
between the satellite and a ground station. The location of the station on the ground
and the trajectories of the satellite passes are chosen to optimize the global geometry
of the test for a given direction. Figure 7 is an example of geometry applied to the T2L2
project showing the successive passes of the Jason-2 satellite above Europe.


Fig. 6. Delay comparison between the uplink and the downlink for various laser beam
orientations.

Fig. 7. Example of successive passes of the Jason-2 satellite above Europe. Obtained from the T2L2
website and developed by the data mission center.


In cases when the stability of the onboard clock is not high enough, one
or several ground clocks can monitor the space clock from the ground using an
H-Maser. The second scenario is based on numerous ground to space time transfer
acquisitions in all possible orientations in order to decrease the uncertainty
introduced by both the optical link and the onboard clock, and to provide global
orientation coverage. This ground to space time transfer is carried out using several
ground stations equipped with ultrastable clocks such as H-Masers. In both
scenarios, we get:


$\frac{\delta c}{c} = \frac{\sigma_x(\tau)}{2\,T_{\mathrm{prop}}\,(1 - \cos\theta)},$  (3)


where $\sigma_x$ is the time deviation of the global link (clocks + laser link), $\tau$ is the
integration time of the observation, $T_{\mathrm{prop}}$ is the propagation delay and $\theta$ is the angle
scanned by the laser beam. When only a single measurement is made
for a given orientation, Eq. (3) must be evaluated with the time stability $\sigma_x$ of the
ground-space link over an integration time $\tau$ corresponding to a satellite pass, $T_{\mathrm{pass}}$.
When numerous acquisitions $N_{\mathrm{acq}}$ are made over a long time $T_{\mathrm{acq}}$ (for
instance 1 year), the part of the uncertainty corresponding to each acquisition can be
reduced by averaging. If there are no systematic effects, $\sigma_x$ could be divided by a
factor $\sqrt{N_{\mathrm{acq}}}$. In that case, Eq. (3) must be evaluated with the quadratic sum
of that term and of the term corresponding to the time stability of the
global link over $\tau = T_{\mathrm{acq}}$. This latter term may be predominant.

When the single acquisition mode is applied to T2L2 on Jason-2, $T_{\mathrm{pass}}$ is roughly
1000 s. In optimal measurement conditions, T2L2 gives a time stability $\sigma_x$
over 1000 s equal to $\sigma_x(1000\,\mathrm{s}) = 50$ ps (see Sec. 4.2). The quantity $\delta c/c$ can then
be determined with an uncertainty of roughly $10^{-9}$. In the numerous
acquisitions mode, implemented with $T_{\mathrm{acq}} = 10$ days, we obtain a time stability
of the whole link in the range of 10 ps, which could allow for determining $\delta c/c$ at
the $2 \times 10^{-10}$ level.
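
The order of magnitude quoted above can be reproduced with a short sketch of Eq. (3), read here as delta_c/c = sigma_x / (2 T_prop (1 - cos theta)). The propagation delay (5 ms) and the scanned angle (120 degrees) used in the example are hypothetical values chosen for illustration, not figures from the T2L2 analysis; the averaging helper implements the quadratic sum described in the text and assumes no systematic effects.

```python
import math


def delta_c_over_c(sigma_x, t_prop, theta_rad):
    """Sensitivity to an anisotropy of the speed of light, Eq. (3).

    sigma_x   : time deviation of the global link (s)
    t_prop    : one-way propagation delay (s)
    theta_rad : angle scanned by the laser beam (rad)
    """
    return sigma_x / (2.0 * t_prop * (1.0 - math.cos(theta_rad)))


def averaged_sigma(sigma_single, n_acq, sigma_long_term):
    """Quadratic sum of the averaged single-acquisition term and the
    long-term stability of the global link (no systematics assumed)."""
    return math.hypot(sigma_single / math.sqrt(n_acq), sigma_long_term)


# Single pass: 50 ps over 1000 s, hypothetical geometry.
single = delta_c_over_c(50e-12, 5e-3, math.radians(120.0))
print(f"single pass: {single:.1e}")

# Many acquisitions: the long-term term (assumed 10 ps) dominates.
sigma_many = averaged_sigma(50e-12, 400, 10e-12)
print(f"averaged link stability: {sigma_many * 1e12:.1f} ps")
```

With these assumed inputs the single-pass result is a few parts in 10^9, the same order as the value quoted in the text.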

The redshift measurement can be done either with a very eccentric orbit or 
with some accurate clocks. For instance, in the frame of ACES, by using the high 
frequency accuracy of the PHARAO space clock the test can be measured with 
a relative uncertainty of 10~!°. In the frame of T2L2 on Jason-2, both the poor 
accuracy of the embedded quartz oscillator (DORIS) and the quasi-circular orbit of 
Jason-2 (< 107%) do not allow for measuring any relevant value. The uncertainty 
on the redshift could be improved by a factor 10* for an experiment in the Solar 
System as compared to the test that would be measured by ACES. 

The Eddington parameter $\gamma$, first introduced by Eddington, Robertson and
Schiff, measures the amount of spatial curvature produced by mass. General
relativity predicts a $\gamma$ value equal to one, while scalar-tensor theories
express this PPN parameter as $\gamma = (1 + \omega)/(2 + \omega)$, where $\omega$ is an arbitrary
coupling function that determines the strength of the scalar field. The $\omega$ function
could be very large, so that the theory’s predictions would be almost identical to
general relativity as defined today. But $\omega$ could take values that would lead to
significant differences in cosmological models. Irwin I. Shapiro predicted a relativistic
time delay $\Delta t_{\mathrm{Shapiro}}$ in the time of flight of photons propagating in a gravity field
(Shapiro delay). $\gamma$ can be deduced from the measurement of that Shapiro delay with
the strong gravity field of the Sun.³⁶ It can be measured with a spacecraft having
a solar orbit chosen so that the satellite passes behind the Sun as viewed from the
Earth. The Shapiro delay is at a maximum near the occultation, with a variation
as large as 120 µs in a few days during the occultation. It can be enough to
measure only a signature of that delay instead of an absolute delay, in order to relax
the constraint on the long-term time stability of the clocks involved or to use a
simple one-way link. The sensitivity of the measurement is at a maximum during
the conjunction phase of the spacecraft with the Sun. The relative uncertainty on
$\gamma$ can be evaluated by:


$\frac{\sigma_\gamma}{\gamma} = \frac{\sigma_x(\tau)}{\Delta t_{\mathrm{Shapiro}}},$  (4)


where $\sigma_x$ is the time stability of both the optical link and the clocks and $\tau$ the
integration duration of the measurement. With a time stability $\sigma_x$ in the range of
$10^{-14} \cdot \tau^{1/2}$ over one day, $(1 - \gamma)$ can be assessed at the $10^{-7}$ level.
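
As a rough numerical check, the sketch below evaluates the scalar-tensor expression for the Eddington parameter and the ratio of Eq. (4), read here as the link time stability divided by the Shapiro delay variation. This reading of Eq. (4), the function names and the sample value of the coupling are assumptions made for illustration.

```python
import math


def gamma_scalar_tensor(omega):
    """PPN parameter gamma = (1 + omega) / (2 + omega)."""
    return (1.0 + omega) / (2.0 + omega)


def gamma_relative_uncertainty(sigma_x, shapiro_variation):
    """Relative uncertainty on gamma as the ratio of the link time
    stability to the Shapiro delay variation (assumed form of Eq. (4))."""
    return sigma_x / shapiro_variation


# gamma tends to the general relativity value of 1 for large omega.
print(1.0 - gamma_scalar_tensor(1.0e4))

# Stability of 1e-14 * sqrt(tau) over one day, 120 microsecond variation.
one_day = 86400.0
sigma_day = 1e-14 * math.sqrt(one_day)
print(gamma_relative_uncertainty(sigma_day, 120e-6))
```

Under these assumptions the ratio is of order 10^-8, which is compatible with the statement that the 10^-7 level is reachable once other error sources are included.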


2.3. Solar System science 


Solar System science relies on the ranging of a spacecraft in orbit around
the Sun or around a planet. With a spacecraft in orbit around a planet (or a
satellite), we can study the gravity field of the body from the precise trajectory of
the vehicle. This is of interest for determining the mass of the planet, the structure
of the gravity field or for studying volcanoes. During occultation phases of the orbiter by the
planet, the light beam goes through the atmosphere (if there is one),
which generates a variation of the propagation time. In the case of the Mars
atmosphere, the delays involved can reach a few nanoseconds when the distance between
the light beam and the planet surface tends towards zero. If the spacecraft orbit is
known during this phase, the analysis of the propagation time variation together
with the variation of the laser flux allows for extracting some atmospheric
parameters.

Laser ranging in the Solar System also allows for the determination of the masses
and densities of asteroids and of the solar quadrupole moment
parameter $J_2$. Here again, ranging can be carried out with a one-way link
since most of these parameters can be deduced from a signature on the trajectory
of the spacecraft.


2.4. Solar System navigation based on clock comparison 


Classical navigation in the Solar System is usually done with microwave links
operated from the large antennas of the Deep Space Network (DSN) facility. An
interesting alternative is to use laser links. The direct information delivered by a
clock comparison is a time of flight and hence a radial distance. This can be achieved
at the centimeter level for an integration time of a few thousand seconds. Tracking carried
out with several synchronized Earth laser stations also allows for measuring the
angular position of the spacecraft (localization in the plane perpendicular to the line of
sight). This can be done with differential measurements of the arrival time onboard,
allowing the geometry of the system to be resolved. Since the measurement done
at the spacecraft is differential, no long-term stability is required here for the space
instrument. With distances between ground stations in the range of 10,000 km, an
uncertainty on the positioning of the stations at the centimeter level, and a time
synchronization between the stations of 30 ps (obtained from a classical two-way
laser link like T2L2), an uncertainty on the angular determination of a few nanoradians
is achievable. Two-way laser time transfer offers a unique opportunity to validate this kind
of one-way laser ranging concept by directly comparing the one-way and two-way
links.
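
The angular figure quoted above can be checked with a first-order sketch: the differential path error, formed from the synchronization error converted to a length and the station position error, is divided by the baseline. This small-angle estimate with a baseline perpendicular to the line of sight is an assumption made for illustration, not the authors' full geometric model.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def angular_uncertainty(baseline_m, sync_uncertainty_s, position_uncertainty_m):
    """First-order angular uncertainty (rad) for two stations separated
    by baseline_m, assuming independent synchronization and position errors."""
    path_error_m = math.hypot(C * sync_uncertainty_s, position_uncertainty_m)
    return path_error_m / baseline_m


# Values from the text: 10,000 km baseline, 30 ps synchronization,
# 1 cm station position uncertainty.
angle = angular_uncertainty(1.0e7, 30e-12, 0.01)
print(f"{angle * 1e9:.2f} nrad")
```

The 30 ps synchronization corresponds to about 9 mm of path, so both contributions are comparable and the result lands at the nanoradian level.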


3. Time Transfer by Laser Link: T2L2 on Jason-2 
3.1. Principle 


Basically, T2L2 allows for the synchronization between a ground clock linked to a
laser station and a clock onboard a satellite. For a ground to ground time transfer between
several clocks, several elementary links between ground and space are made and
the space segment is only used as a relay between the clocks on ground. To perform
a T2L2 time transfer, the laser station emits light pulses (20 ps to 200 ps) toward

Fig. 8. T2L2 principle. For every laser pulse, the laser station measures the start epoch $t_e$ and
the return epoch $t_r$ after reflection from space. The T2L2 payload records the arrival epoch
onboard $t_b$.


the satellite. A Laser Retroreflector Array (LRA) onboard the satellite returns a fraction
of the received pulses to the ground station. The ground station measures, for each
laser pulse, the start epoch $t_e$ and the return epoch $t_r$ after reflection from
space. The T2L2 payload records, in the time scale of the space oscillator, the
arrival epoch onboard $t_b$ (Fig. 8).

These data are downlinked to ground with a classical microwave link within 2 h
following the recording. The differences between the start and return epochs recorded
at ground level allow for determining the propagation delay of the transfer.
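
The way the three epochs combine can be sketched as follows. This is a deliberately simplified model, given as an assumption for illustration: the uplink delay is taken as half the measured round trip, and relativistic, atmospheric, instrumental and geometric corrections are all ignored. The function name and sample epochs are hypothetical.

```python
def space_minus_ground_offset(t_e, t_r, t_b):
    """Offset (s) of the space clock relative to the ground clock.

    t_e : start epoch, ground time scale
    t_r : return epoch, ground time scale
    t_b : arrival epoch onboard, space time scale
    """
    one_way_delay = 0.5 * (t_r - t_e)          # propagation delay estimate
    arrival_ground_scale = t_e + one_way_delay  # arrival epoch seen from ground
    return t_b - arrival_ground_scale


# Hypothetical triplet: 8 ms round trip, space clock ahead by 2.5 microseconds.
offset = space_minus_ground_offset(0.0, 8.0e-3, 4.0e-3 + 2.5e-6)
print(offset)
```

A ground to ground comparison then follows from the difference of two such offsets obtained by two stations against the same space clock.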

In the framework of T2L2 on Jason-2, the maximum distance between the sta- 
tions in common view mode is roughly 6000 km. 


3.2. Laser station ground segment 


The ground segment is based on an international network including more than 40
laser ranging stations.³⁷ The activities of that network are organized under the
International Laser Ranging Service (ILRS),³⁸ which provides global satellite and
lunar laser ranging data to support research in geodesy, geophysics, lunar science
and fundamental physics. That network continuously monitors the distance of more
than 40 satellites orbiting the Earth. The distances provided by the stations
are based on the measurement of the time of flight of very short laser pulses. The laser
pulses are sent and received by the station and reflected by corner cubes
embedded on the satellite. A typical station (Figs. 9 and 10) is composed of the
following elements:


— A pulsed laser to generate pulses between 10 ps to 200 ps (FWHM) at a 10 Hz 
rate and 532nm (YAG doubled). 
— A start detector to get the start events at the output of the laser. 

Fig. 9. Typical laser station, using the same telescope for both laser emission and reception.
Some stations use separate apertures for emission/reception.


Fig. 10. Photograph of the MeO laser station at Grasse (France), built at the end of the 1970s
for lunar laser ranging and redesigned for satellite ranging and time transfer in 2005. (For color
version, see page I-CP5.)


— A telescope to receive the light pulses reflected by the satellite and possibly to 
emit laser pulses toward the satellite. 

— A return detector to detect the pulses reflected by the satellite. 

— An event timer to timestamp the events in the time scale of the ground clock 
to be synchronized. 


In a classical laser station, only the time of flight between the ground and
the satellite matters. For time transfer purposes, it is also necessary to measure
accurately the absolute start time of each emitted laser pulse. This is achieved
with picosecond event timers for the emission epoch and the reception epoch.
In order to obtain the picosecond accuracy required for T2L2, laser stations are

Table 1. Main characteristics of a typical laser station.

Subsystem                            Characteristics
Telescope diameter                   1 m
Telescope slew rate                  5° s⁻¹
Telescope pointing accuracy          5 arcsec
Laser energy                         100 mJ
Laser wavelength                     532.1 nm
Laser FWHM                           100 ps
Laser rate                           10 Hz
Event timer standard deviation       5 ps RMS
Photodetection standard deviation    50 ps RMS
Clock stability ADev (H-Maser)       2 × 10⁻¹⁵


calibrated with a dedicated calibration station. This calibration is based on
simultaneous measurements made by the usual chronometry of the laser station
and by the dedicated calibration station specifically installed for that purpose. It allows
the measurement of the delay between the optical pulse at the mechanical axis of the
telescope and the time reference of the station. The calibration station gathers in a
single piece of equipment all the metrology required to perform that measurement: a
sub-picosecond event timer, an optical module to pick up laser pulses from the laser
station and an optical fiber.

The main characteristics of a typical laser station are given in Table 1.

Among all laser stations of the international network, 20 are currently
contributing actively to T2L2 and ten have the picosecond resolution required for high
performance time transfer. Table 2 lists these laser stations together with the type
of atomic clock used.


3.3. Space instrument 


The T2L2 space equipment is embedded on the Jason-2 satellite as a passenger
instrument. In essence, it is an instrument able to timestamp laser pulses coming
from the Earth at the picosecond level. It comprises a photodetection device and an
event timer connected to an ultrastable quartz oscillator used as the T2L2 onboard


Table 2. T2L2 participating laser stations.

Name            Country        ILRS No.   Clock
Changchun       China          7237       H-Maser
Grasse          France         7845       H-Maser
Herstmonceux    England        7840       H-Maser
Koganei         Japan          7308       Atomic fountain
McDonald        USA            7080       Cesium
Matera          Italy          7941       H-Maser
Mount Stromlo   Australia      7825       Cesium
Potsdam         Germany        7841       Quartz slaved on GPS
Wettzell        Germany        8834       H-Maser
Zimmerwald      Switzerland    7810       H-Maser

Fig. 11. T2L2 global architecture with unit A outside the satellite oriented toward the Earth
and unit B inside the payload. The DORIS USO and the LRA equipment do not belong to the
T2L2 assembly. The laser pulse coming from the ground station is on the lower side of the figure.


clock. The reflection of the laser pulse towards the Earth is done with a retro- 
reflector array. The oscillator and the retro-reflector are used by T2L2 but are not 
part of the T2L2 package. The oscillator is the frequency reference³⁹ of the DORIS
navigation system, and the LRA is used for the laser ranging satellite positioning 
system of Jason-2 (Fig. 11). The space instrument is divided into two parts A and 
B. The A unit includes the photodetection while the B unit contains the event 
timer, some parts of the detection, the power supply and the microcontroller. 

The photodetection device is made with two avalanche photodetectors (unit A
in Fig. 11). One of them runs in a nonlinear mode for chronometry,⁴⁰ the other in a
linear mode to trigger the nonlinear detector and to measure the laser energy. The
primary function of the nonlinear photodetection is to generate, from a very weak
optical pulse, an electrical pulse with a time uncertainty as low as possible. The arrival
epoch of the laser pulse is obtained from the time tagging of that electrical pulse
with the event timer (unit B). The internal delay of the detector has a significant
dependency on the energy received. In order to eliminate the temporal noise that
would be introduced by uncontrolled energy variations, this transit delay has
to be compensated. This is achieved for each pulse received through the linear
photodetector.
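
The energy-dependent compensation can be sketched as a lookup in a calibration curve relating received energy to detector delay. Everything in the snippet is a hypothetical placeholder: the calibration points, the logarithmic energy scale and the function names are invented for illustration and are not T2L2 calibration data.

```python
from bisect import bisect_right

# Hypothetical calibration points: (log10 of received energy in
# arbitrary units, detector transit delay in ps). Placeholder values.
CALIBRATION = [(0.0, 120.0), (1.0, 60.0), (2.0, 20.0), (3.0, 0.0)]


def time_walk_ps(log_energy):
    """Detector delay (ps) by linear interpolation of the calibration
    curve, clamped to the end points outside the calibrated range."""
    xs = [point[0] for point in CALIBRATION]
    ys = [point[1] for point in CALIBRATION]
    if log_energy <= xs[0]:
        return ys[0]
    if log_energy >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, log_energy)
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (log_energy - x0) / (x1 - x0)


def corrected_epoch(raw_epoch_s, log_energy):
    """Arrival epoch (s) with the energy-dependent delay removed."""
    return raw_epoch_s - time_walk_ps(log_energy) * 1e-12


print(time_walk_ps(0.5), corrected_epoch(1.0e-3, 1.0))
```

The point of the design is that the correction is applied per pulse, using the energy measured by the linear channel for that same pulse.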

The energy of each pulse is recorded together with the arrival epoch measured
by the event timer, and the compensation is applied in post-processing on the
ground. The laser energy received at the satellite is sent to the photodetectors
through a dedicated optics assembly. These optics limit the field of view
of the instrument (Sun protection), adjust the photon number and limit
the spectral bandwidth. Each photodetection chain (linear and nonlinear) has a
distinct optical assembly. The optics of the linear detection is integrated with the
linear detector into a single module. The optics of the nonlinear detection is divided

Fig. 12. Photograph of units A (right) and B (left) of the T2L2 space instrument. The cylinders
on the right are the detection modules (linear and nonlinear). The LRA module is not shown
in the photo. (For color version, see page I-CP5.)


into two subsystems linked together with an optical fiber (between units A and B).
Figure 12 is a photograph of both units A and B.

In order to minimize false detections, the nonlinear photodiode is gated a few
nanoseconds before the expected arrival of the laser pulses with an external active
quenching circuit triggered by the linear avalanche photodiode. The delay between
the applied voltage and the arrival time of the laser pulse is obtained with a few meters
of optical fiber acting as a delay line. The linear detector is used to measure
both the laser energy and the DC level produced by the sunlight backscattered by
the Earth. These measurements make it possible to compensate the time walk of the
nonlinear photodiode and to adjust automatically the threshold of the trigger system
(day, night).
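The time walk compensation described above can be pictured with a short sketch. It assumes that the energy dependence of the detector delay is available as a calibration table and that a linear interpolation is adequate; the table values, the function names and the energy unit are hypothetical and are not taken from the T2L2 processing chain.

```python
from bisect import bisect_right

# Hypothetical calibration table (illustrative values only): received energy in
# arbitrary units versus internal delay of the nonlinear detector in ps.
CAL_ENERGY = [1.0, 2.0, 5.0, 10.0, 20.0]
CAL_DELAY_PS = [400.0, 330.0, 260.0, 220.0, 200.0]


def time_walk_ps(energy: float) -> float:
    """Linearly interpolate the calibration table, clamping at both ends."""
    if energy <= CAL_ENERGY[0]:
        return CAL_DELAY_PS[0]
    if energy >= CAL_ENERGY[-1]:
        return CAL_DELAY_PS[-1]
    i = bisect_right(CAL_ENERGY, energy)
    e0, e1 = CAL_ENERGY[i - 1], CAL_ENERGY[i]
    d0, d1 = CAL_DELAY_PS[i - 1], CAL_DELAY_PS[i]
    return d0 + (d1 - d0) * (energy - e0) / (e1 - e0)


def compensate(epoch_ps: float, energy: float) -> float:
    """Remove the energy-dependent transit delay from a raw onboard epoch."""
    return epoch_ps - time_walk_ps(energy)
```

Each onboard epoch would be passed through `compensate` together with the energy recorded by the linear detector for the same pulse.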

The energy density received at the satellite depends on the distance between
the ground station and the satellite, which itself depends on the incident angle p
between the beam and the optical axis of the instrument through a reversible law.
This reversibility makes it possible to compensate the variation of the energy density
during the pass of the satellite over the laser station with an optical device generating
a variation depending on p. This is achieved with a neutral density device having
a transmission depending on the incident angle of the laser beam. Among other
things, the device minimizes the solar flux backscattered by the Earth
and equalizes the optical flux received onboard whatever the position of the
satellite. Each photodetector channel includes an interference filter to improve the
signal to noise ratio of the detection. It is tuned to the nominal wavelength of
the doubled Nd:YAG laser (532.1 nm). Each channel also has a collimation optic
that adjusts the field of view to the angular size of the whole Earth seen
from the satellite. The reflection function of T2L2 is obtained from the pyramidal
LRA unit made with one central corner cube at the top and eight corner cubes on the
periphery. Because the LRA and detection unit are not located at the same place,
the reflection and detection points do not coincide. The projected distance between
these points on the line of sight is computed through a post treatment on ground.

Table 3. DORIS oscillator characteristics measured before integration.

Characteristics             Measurement
Time stability (TDev)       <1 ps at 1 s
                            2 ps at 10 s
                            20 ps at 100 s
Aging                       <1 x 10^-11 day^-1
Thermal sensitivity         6.5 x 10^-18 K^-1
Acceleration sensitivity    7.6 x 10^-19 g^-1
Radiation sensitivity       6.7 x 10^-1? rad^-1


The time reference used onboard is the quartz oscillator of the DORIS equipment.
It comprises a dewar to protect both the sensitive electronics and the resonator from
temperature fluctuations. Its main characteristics, measured before integration, are
given in Table 3.

The quartz oscillator is connected to the event timer, which timestamps all
incoming events. The event timer can be considered as an ultrahigh speed counter
made with an analog Vernier and a low frequency digital counter (100 MHz). The
Vernier gives the time tag information with a time resolution of 100 fs and a
temporal dynamic range of 20 ns, while the digital counter has a time resolution of
10 ns and a temporal dynamic range of more than five years. Both the Vernier and the
digital counter are driven by a frequency synthesis module designed to translate
the 10 MHz DORIS signal to 100 MHz.
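As a rough illustration of how a coarse counter and a fine Vernier can be merged into one time tag, the following sketch combines the two readings in integer femtoseconds. It is a simplified model and not the actual T2L2 event timer logic: it assumes the Vernier reading is simply added to the coarse count, and the function names are hypothetical. Integer arithmetic is used because a span of five years at 100 fs resolution exceeds the 53-bit mantissa of a double.

```python
COARSE_TICK_FS = 10_000_000   # 10 ns period of the 100 MHz digital counter, in fs
VERNIER_LSB_FS = 100          # 100 fs Vernier resolution


def epoch_fs(coarse_count: int, vernier_steps: int) -> int:
    """Combine both readings into one epoch expressed in integer femtoseconds."""
    if coarse_count < 0 or vernier_steps < 0:
        raise ValueError("counts must be non-negative")
    return coarse_count * COARSE_TICK_FS + vernier_steps * VERNIER_LSB_FS


def coarse_counter_bits(span_s: float) -> int:
    """Number of bits the 10 ns counter needs to cover span_s seconds."""
    ticks = round(span_s * 100e6)   # number of 100 MHz periods in the span
    return ticks.bit_length()


five_years_s = 5 * 365.25 * 86400
bits_needed = coarse_counter_bits(five_years_s)
example_epoch = epoch_fs(3, 25)
```

With these assumptions, covering five years with 10 ns steps requires a 54-bit coarse counter.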

Locations of units A and B on the Jason-2 satellite are depicted in Fig. 13. 

Main characteristics of the T2L2 space instrument are given in Table 4. 


Fig. 13. CAD view of the whole Jason-2 satellite. The T2L2 instrumentation is split into two units
A and B, respectively outside and inside the satellite. (For color version, see page I-CP5.)

Table 4. Main characteristics of the T2L2 space instrument on Jason-2.

Subsystem                            Characteristics
Mass of unit A                       1.2 kg
Mass of unit B                       8 kg
Volume A                             160 x 116 x 103 mm^3
Volume B                             270 x 280 x 150 mm^3
Power consumption                    42 W
Optical detection wavelength         532.1 nm
Detection threshold                  0.3 µJ m^-2
Field of view                        55.1°
Photodetection standard deviation    30 ps RMS at mid energy
Event timer standard deviation       2.8 ps RMS
Quartz oscillator TDev               20 ps at 100 s


3.4. Time equation


We consider a laser station capable of emitting and receiving laser pulses and a satellite
capable of reflecting the pulse and measuring its arrival time onboard. We denote by $t_e$
and $t_r$ respectively the emission and reception epochs of laser pulses measured at
the laser station, and by $t_b$ the reception epoch measured at the satellite. A hundred
years ago, Poincaré and Einstein defined a procedure to synchronize clocks
in different places. Applied to our ground to space time transfer described in an
inertial frame, the satellite clock has to be compared by taking into account the
simple propagation delay to synchronize the clocks. Using this procedure in
a different inertial frame moving with respect to the first frame, we get a different
synchronization. With the assumptions of the principle of relativity and a constant light
velocity, Poincaré and Einstein derived the Lorentz transformation. Although this
looks very simple from our present point of view, conceptually it was a firm step
forward at that time. With acceleration and gravity and other corrections, the time
offset $\Delta_{AS}$ between a ground clock A and the clock in space can be described by


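\[
\begin{aligned}
\Delta_{AS} &= t_e + \frac{t_r - t_e}{2} - t_b + C_{Sag} + C_{Rel} + C_{atm} + C_{Ical} + C_{Ecal} \\
            &= \frac{t_e + t_r}{2} - t_b + C_{Sag} + C_{Rel} + C_{atm} + C_{Ical} + C_{Ecal},
\end{aligned}
\tag{5}
\]


where $C_{Sag}$ is the Sagnac correction, $C_{Rel}$ the relativistic frequency shift, $C_{atm}$ an
atmospheric correction and $C_{Ical}$ and $C_{Ecal}$ are calibration terms for the laser
station.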
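A minimal sketch of how Eq. (5) could be evaluated for one laser shot is given below. It assumes the equation reads as the mid-point of the emission and reception epochs, minus the onboard epoch, plus the five corrections; all quantities are in picoseconds and the function name is hypothetical.

```python
def ground_to_space_offset_ps(t_e, t_r, t_b, c_sag=0.0, c_rel=0.0,
                              c_atm=0.0, c_ical=0.0, c_ecal=0.0):
    """Time offset between ground clock A and the space clock, in ps.

    Assumed reading of Eq. (5):
    (t_e + t_r)/2 - t_b + C_Sag + C_Rel + C_atm + C_Ical + C_Ecal.
    """
    mid_epoch = (t_e + t_r) / 2.0   # epoch of reflection in the ground time scale
    return mid_epoch - t_b + c_sag + c_rel + c_atm + c_ical + c_ecal


# Illustrative numbers only: a 10 ms round trip, a 10 ns Sagnac term and a
# 1.5 ps atmospheric term.
offset = ground_to_space_offset_ps(0, 10_000_000_000, 5_000_000_123,
                                   c_sag=10_000.0, c_atm=1.5)
```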

The term $C_{Sag}$ is computed from the coordinates $\mathbf{x}$ and $\mathbf{X}$ of the satellite and
the station for each emission epoch. It is given by

\[
C_{Sag} = \frac{1}{c^2} (\mathbf{x} - \mathbf{X}) \cdot \dot{\mathbf{X}}. \tag{6}
\]
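The order of magnitude of this term can be checked with a short numerical sketch. It assumes Eq. (6) is the scalar product of the station to satellite vector with the station velocity, divided by $c^2$, in an inertial frame; the geometry below (equatorial station, satellite 1000 km higher and 2000 km ahead along the direction of rotation) is an arbitrary illustration, not a real Jason-2 pass.

```python
C_LIGHT = 299_792_458.0      # m/s
OMEGA_EARTH = 7.292115e-5    # rad/s, Earth rotation rate


def sagnac_correction(x_sat, x_sta, v_sta):
    """Return (x - X) . Xdot / c^2 in seconds; vectors in SI units."""
    dot = sum((xs - xg) * v for xs, xg, v in zip(x_sat, x_sta, v_sta))
    return dot / C_LIGHT ** 2


# Illustrative geometry (assumed): station on the equator.
r_earth = 6.371e6
x_sta = (r_earth, 0.0, 0.0)
v_sta = (0.0, OMEGA_EARTH * r_earth, 0.0)
x_sat = (r_earth + 1.0e6, 2.0e6, 0.0)

c_sag = sagnac_correction(x_sat, x_sta, v_sta)
```

With this geometry the result is close to 10 ns.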


$C_{Sag}$ has an amplitude of roughly 10 ns. $C_{Rel}$ represents the relativistic frequency
shift of the space oscillator integrated as a function of time. It is computed from

\[
C_{Rel} = \int \frac{1}{c^2} \left( \frac{\dot{\mathbf{x}}^2}{2} - \frac{G m_\oplus}{|\mathbf{x}|} \right) dt, \tag{7}
\]

where $m_\oplus$ is the mass of the Earth, $G$ is the gravitational constant and $\dot{\mathbf{x}}$ is the
velocity of the satellite. This relativistic frequency shift includes a periodic part
having an amplitude of roughly 100 ps integrated over a whole orbit.

$C_{atm}$ is a correction introduced by the atmosphere for the optical path difference
between the uplink and the downlink. The time delay for the one-way link is roughly
40 ns and the time correction $C_{atm}$ between the uplink and the downlink is up to
1.5 ps.

$C_{Ical}$ is required to get the absolute time of flight between the reference point
of the station and the satellite. It is computed from epochs measured with some
laser pulses sent onto the reception chain of the station through a retro-reflector
located in the laser beam (Fig. 9) at a given distance $d_{cc}$ from the spatial reference of
the station (cross axes of the telescope mount):

\[
C_{Ical} = \langle t_e - t_r \rangle_{N_{Ical}} - \delta_{cc}, \tag{8}
\]

where $t_r$ denotes the reception epochs of the $N_{Ical}$ laser pulses reflected by the retro-reflector
and $\delta_{cc}$ is the delay corresponding to the free space propagation between the corner
cube and the spatial reference of the laser station.

The external calibration $C_{Ecal}$ sets the absolute epoch of the laser
emission. $C_{Ecal}$ may be written as

\[
C_{Ecal} = \langle t_e - t_E \rangle_{N_{Ecal}} - \delta_{ocx} - \delta_{prE}, \tag{9}
\]

where $t_E$ is the emission epoch of the $N_{Ecal}$ events measured by the calibration station,
$\delta_{ocx}$ is the free space propagation delay between the reference of the station and the
optical input of the calibration station and $\delta_{prE}$ is the global internal propagation delay
inside the calibration station. By using the same calibration station to calibrate
the different laser stations, the delays $\delta_{prE}$ in Eq. (9) do not need to be known
accurately.

For the ground to ground time transfer in common view mode (distance <
6000 km), the space oscillator is only required over the time interval between laser
pulses (typically from 0.1 s to a few seconds). Over longer periods, the noise is
common to both stations and the difference becomes negligible. In noncommon view
mode (distance > 6000 km), the noise of the space oscillator has to be considered
over the time interval corresponding to the time delay between consecutive
passes. In that case, the noise of the space oscillator becomes an important source
of noise.

A ground to ground time transfer $\Delta_{AB}$ between two ground clocks A and B in
common view mode is computed from the difference between the individual time
transfers $x_{AS}$ and $x_{BS}$, individually acquired by the stations and corrected with a
model $C_{osc}$ describing the mid-term behavior of the space oscillator.
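The combination of two ground to space transfers can be sketched as follows. The sign convention, the way the oscillator model is removed from each link and the quadratic form of the model are assumptions made for illustration; the text only states that the difference is corrected with a model of the space oscillator.

```python
def oscillator_model_ps(t_s, a0=0.0, a1=0.0, a2=0.0):
    """Hypothetical quadratic model C_osc(t) of the space clock, in ps."""
    return a0 + a1 * t_s + a2 * t_s * t_s


def ground_to_ground_ps(x_as_ps, t_a_s, x_bs_ps, t_b_s, model=oscillator_model_ps):
    """Delta_AB from two ground to space transfers.

    x_as_ps, x_bs_ps: ground to space transfers of stations A and B (ps).
    t_a_s, t_b_s: onboard dates (s) at which each transfer was measured.
    model: callable returning the modeled space clock behavior (ps) at a date.
    """
    return (x_as_ps - model(t_a_s)) - (x_bs_ps - model(t_b_s))


def drift(t_s):
    """Example model: pure frequency offset of 100 ps per second."""
    return 100.0 * t_s


delta_ab = ground_to_ground_ps(2000.0, 10.0, 1750.0, 10.5, model=drift)
```

In common view the two onboard dates differ by a few seconds at most, so the model term nearly cancels; it matters when the dates are separated by one or more orbits.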


3.5. Error budget 


Tables 5 and 6 summarize the main uncertainties of a typical laser station and those
of the space instrument, respectively.

Table 5. Main uncertainties/performances of a typical laser station suited for the T2L2
project.

Subsystem                  Characteristics                         u (ps)   Comments
Start detector             InGaAs pin photodiode,                  5        k = 1
                           bandwidth = 3 GHz
Return detector            Avalanche photodiode, Geiger mode       50       k = 1
Nd:YAG laser               λ = 532 nm, 10 Hz, E = 50 mJ,           25       FWHM variation
                           FWHM = 100 ps
Event timer                Two independent channels,               5        k = 1
                           resolution 1 ps
Pulse per second (PPS)     Numerical division of a frequency       5        k = 1
generator                  source
TF lab to LS cable         L < 100 m, thermal < ±5 °C,             30       Relative delay
                           < 0.1 ps K^-1 m^-1
Ref. corner cube           Simple corner cube,                     4        Absolute location
                           localization ±1 mm


Table 6. Main uncertainties/performances of the T2L2 space instrument.

Subsystem         Characteristics                         u (ps)   Comments
Board detector    Si avalanche photodiode (Geiger),       70       k = 1 (at min. energy)
                  bandwidth = 1 GHz
Event timer       Resolution 1 ps, dead time 200 µs       2.8      k = 1
Oscillator        DORIS quartz USO                        20       σ_x at 100 s
LRA               Nine Suprasil corner cubes 32 mm,       13       Laser signature
                  pyramidal 50°

Considering that the terms of Eq. (5) are independent, the combined uncertainty
of a time transfer $u_c(\Delta_{AS})$ between a ground clock A and the space clock is
computed from the quadratic sum of the uncertainty of each term. The combined
uncertainty of a ground to ground time transfer in common view mode is therefore the
quadratic sum of the uncertainties of the ground to space time transfers $u_c(\Delta_{AS})$,
$u_c(\Delta_{BS})$ and the uncertainty of the onboard model $u_c(C_{osc})$. A detailed analysis of the
whole experimental setup allows an uncertainty budget to be determined for each
term. Table 7 gives a summary of these combined uncertainties $u_c$ for a set of data
corresponding to a complete acquisition of a satellite pass over a given laser station.
Table 8 gives the overall T2L2 uncertainties for the ground to space and ground to
ground time transfers.

In the common view mode, because the space oscillator is only required over the
time interval between laser pulses coming from each laser station (a few seconds
at most), the uncertainty $u(C_{osc})$ can be neglected as compared to the other
noises.
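The quadratic sum can be illustrated with the ground to space value of Table 8. The sketch assumes that both stations have the same ground to space standard uncertainty, taken as half of the expanded value (k = 2), and that the oscillator term is negligible in common view; these are simplifying assumptions, and the function name is hypothetical.

```python
import math


def quadratic_sum(*uncertainties_ps):
    """Root sum of squares of independent standard uncertainties (ps)."""
    return math.sqrt(sum(u * u for u in uncertainties_ps))


u_gs = 98.0 / 2.0                      # standard uncertainty from the k = 2 value
u_gg = quadratic_sum(u_gs, u_gs, 0.0)  # two stations, oscillator term neglected
expanded_gg = 2.0 * u_gg               # back to a coverage factor of 2
```

Under these assumptions the expanded ground to ground uncertainty comes out within 1 ps of the 138 ps quoted in Table 8.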

The highest uncertainty terms used in the computation of the error budget
come from the delay variation of the cable between the time and frequency
laboratory and the event timer of the laser station, and also from the laser pulse
width variations. The first contribution can be reduced by monitoring the
propagation delay measured by an event timer through a double propagation of the signal
emitted by a distributor and repeated by the user. With such monitoring and with
a laser having a pulse width uncertainty of 10 ps, the ground to ground expanded
uncertainty becomes better than 100 ps.


Table 7. Combined uncertainty for a typical laser station and for a complete
pass acquisition.

Uncertainty source                      u (ps)   Comments
Emission epoch $u_c(t_e)$               34       Laser station
Reception epoch $u_c(t_r)$              17       Laser station
Onboard epoch $u_c(t_b)$                16       Laser and space instrument
Internal calibration $u_c(C_{Ical})$    21       Laser station
External calibration $u_c(C_{Ecal})$    36       Laser and calibration station
Atmospheric $u_c(C_{atm})$
Sagnac $u_c(C_{Sag})$                            Orbitography
Relativity                                       Orbitography


Table 8. Ground to space and ground to ground uncertainties for a complete satellite
pass for the whole project.

Time transfer                                                u (ps)   Comments
Ground to space expanded uncertainty $u_E(\Delta_{GS})$      98       Coverage factor = 2
Ground to ground expanded uncertainty $u_E(\Delta_{GG})$     138      Coverage factor = 2


3.6. Link budget 


The energy received at the space segment is deduced from the study of the uplink.
It depends on the atmospheric transmission $T_{atm}$ and on the distance $R$ between
the satellite and the laser station. The energy received on the ground is deduced
from both the uplink and downlink and also depends on the characteristics of the
corner cubes embedded on the satellite and the characteristics of the telescope.
The energy distribution at the output of the telescope can usually be considered
as uniform over the whole aperture of the telescope ($2 r_{tel}$). At a distance $R \gg r_{tel}$,
if the beam is diffraction limited, the distribution of such a beam can be
approximated by the distribution of a Gaussian beam having a beam waist $w_0 = r_{tel}$.
The atmospheric turbulence significantly modifies the propagation of the beam.
It introduces a beam spreading and also creates a speckle pattern in the plane
of the satellite by interference. Because the atmosphere changes in time and
the satellite moves along its orbit, this pattern evolves very rapidly and introduces
significant variations of energy from one shot to another. To take into account
the beam spreading generated by the atmosphere, we must consider an equivalent
aperture given by $r_0$, where $r_0$ is the Fried parameter, which depends on the
atmospheric conditions. The energy flux $D_{R0}$ (time-averaged) at the center of the beam
at the distance $R$ can be approximated by

\[
D_{R0} = E_{las} \, \frac{2 T_{atm} \eta_{tel} \pi r_0^2}{R(p)^2 \lambda^2}, \tag{10}
\]

where $E_{las}$ is the laser energy per pulse, $\eta_{tel}$ is the transmission of the telescope, $\lambda$
is the wavelength, $T_{atm}$ is the transmission of the atmosphere and $p$ is the incident
angle between the laser beam and the Earth-satellite axis. To illustrate the speckle
pattern, the maximum and minimum energy densities (in the center of the speckles)
have to be considered respectively two times greater than the mean density of
the whole beam and close to zero. Figure 14 is an illustration of Eq. (10) as a
function of the incident angle $p$ of the beam as compared to the optical axis of the
instrument, using the following data set: $E_{las}$ = 30 mJ; $r_0$ = 24 mm; $T_{atm}$ = 0.81;
$h_{St}$ = 1300 m; $H_S$ = 1330 km; $\eta_{tel}$ = 0.44; $\lambda$ = 532 nm; $R_E$ = 6371 km.

Fig. 14. Energy density and space segment distance as a function of the incident angle $p$.

The energy received on the reception channel of the laser station $E_{Rec}$ is given by

\[
E_{Rec} = D_{R0} \cdot \frac{\sigma_{cc}(p)}{4 \pi R^2} \cdot T_{atm} \cdot S_{tel} \cdot \eta_{tel}, \tag{11}
\]

where $\sigma_{cc}$ is the cross-section of the LRA. $\sigma_{cc}$ depends on the incident angle $p$
because of the geometry of the pyramid supporting the individual corner cubes,
and because of the speed aberration, which introduces an angle between the laser
beam and the real line of sight between the satellite and the laser station. The
typical energy received with a 1.5 m telescope on the ground is in the range of 0.4 fJ
(roughly 1000 photons).
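The uplink budget can be explored numerically with the data set quoted for Fig. 14. The sketch below rests on several assumptions: Eq. (10) is read as the on-axis energy density of a Gaussian beam of waist $r_0$, the 1300 m value is taken as the station altitude, the Earth is treated as a sphere to derive the slant range from the incident angle, and the atmospheric transmission is kept constant with angle. Function and variable names are hypothetical.

```python
import math

E_LAS = 30e-3          # J, laser energy per pulse
R0 = 24e-3             # m, Fried parameter
T_ATM = 0.81           # atmospheric transmission (kept constant here)
ETA_TEL = 0.44         # telescope transmission
WAVELENGTH = 532e-9    # m
R_EARTH = 6371e3       # m
H_SAT = 1330e3         # m, satellite altitude
H_STA = 1300.0         # m, read here as the station altitude (assumption)


def slant_range(p_deg):
    """Station to satellite distance (m) for an incident angle p at the satellite."""
    p = math.radians(p_deg)
    r_sat = R_EARTH + H_SAT
    r_sta = R_EARTH + H_STA
    disc = r_sta ** 2 - (r_sat * math.sin(p)) ** 2
    if disc < 0:
        raise ValueError("the line of sight does not reach the station altitude")
    return r_sat * math.cos(p) - math.sqrt(disc)


def energy_density(p_deg):
    """Time-averaged on-axis energy density (J/m^2), assumed reading of Eq. (10)."""
    r = slant_range(p_deg)
    return E_LAS * 2 * T_ATM * ETA_TEL * math.pi * R0 ** 2 / (r ** 2 * WAVELENGTH ** 2)


zenith_density = energy_density(0.0)
low_density = energy_density(50.0)
```

At normal incidence the range reduces to the altitude difference, and the computed density is a few tens of µJ per square meter, well above the 0.3 µJ m^-2 detection threshold of Table 4.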


3.7. Exploitation 


Two mission centers operate the exploitation activities of the T2L2 project. The
first, operated by CNES, is the Instrument Mission Center (IMC), which is
in charge of gathering all the raw data coming from the satellite and generating
preanalyzed products. The second, under the responsibility of GeoAzur-OCA, is the
Analysis Mission Center (AMC), which generates the final ground to space
and ground to ground time transfers from the preanalyzed products delivered by the
IMC. The AMC operates continuously on a daily basis and is able to compute a
given link three days after a given laser acquisition.

When T2L2 is operated simultaneously from several laser stations, the arrival
epochs of all laser events are mixed together. The first important step carried out
by the AMC is to associate each recorded event with the corresponding laser station.
This is done by comparing emission epochs of a given laser station with all epochs
measured onboard. The next step is to reject the outliers coming from false
detections of both the space instrument and laser stations. Data are then corrected to
take into account the instrumental model of the space segment?? including geometry,
photodetection time walk versus energy and time walk versus attitude. At this
stage, the AMC generates data files usable by the scientific community,
comprising schematically emission and reception epochs together with arrival epochs
onboard. From these files, it is then possible for anyone to compute the basic
ground to space and ground to ground time transfers. The AMC also computes
all available ground to space time transfers, and opens the possibility of computing
any ground to ground time transfer on demand. The AMC has developed a
dedicated T2L2 website*? in order to share all these data and computations with the
scientific community.
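The identification step can be pictured with a simple matching routine. This is one possible approach and not the AMC algorithm: it assumes that a rough offset between the onboard and ground time scales is known beforehand, predicts the onboard arrival as the mid-point of the emission and reception epochs, and keeps the nearest onboard event inside an acceptance window. All names are hypothetical.

```python
from bisect import bisect_left


def match_events(station_pairs, onboard_epochs, clock_offset, window):
    """Associate station shots with onboard detections.

    station_pairs: list of (t_e, t_r) in seconds, station time scale.
    onboard_epochs: sorted list of onboard epochs in seconds.
    clock_offset: rough onboard minus ground offset in seconds (assumed known).
    window: half-width of the acceptance window in seconds.
    Returns a list of (shot_index, onboard_index) pairs.
    """
    matches = []
    for i, (t_e, t_r) in enumerate(station_pairs):
        predicted = 0.5 * (t_e + t_r) + clock_offset
        j = bisect_left(onboard_epochs, predicted)
        best = None
        for k in (j - 1, j):          # only the two neighbors can be nearest
            if 0 <= k < len(onboard_epochs):
                err = abs(onboard_epochs[k] - predicted)
                if err <= window and (best is None or err < best[0]):
                    best = (err, k)
        if best is not None:
            matches.append((i, best[1]))
    return matches


shots = [(0.0, 0.010), (0.1, 0.110)]
onboard = [0.2050001, 0.25, 0.3049999]
pairs = match_events(shots, onboard, clock_offset=0.2, window=1e-6)
```

Events left unmatched, such as the one at 0.25 s in this example, would belong to another station or be treated as noise.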

Figure 15 is an example of what we get for a single satellite pass obtained with 
the MeO Laser Station linked to a H-Maser. 

The corresponding time stability, computed from the square root of the time
variance, is illustrated in Fig. 16.

Fig. 15. OCA H-Maser to Jason-2 quartz time transfer example acquired in November 2013. A
linear regression has been subtracted from the time transfer to take into account the frequency
offset between clocks.

Fig. 16. OCA-DORIS time deviation (black, left scale) and modified Allan deviation (red, right
scale). Performances are in accordance with the DORIS USO.

Fig. 17. OCA-DORIS comparison over several consecutive passes. Up to seven passes on Jason-2
can be acquired consecutively.

The accumulation of several consecutive passes over the same station gives
a mid-term time transfer comparison. Figure 17 is an example of six such
consecutive passes obtained in spring 2010 with the MeO laser station linked to an
H-Maser. After subtracting a quadratic regression to take into account the frequency
offset between the clocks and the linear frequency drift, we get over all the passes
$u(\Delta_{MeOS})$ = 725 ps ($u(\Delta_{MeOS})$ = 7100 ps with a linear fit). This result illustrates
that a noncommon view time transfer must be done with at least a subtraction of
a quadratic regression in order to remove the mid-term drift of the space oscillator.
This can be achieved by using a reference laser station capable of tracking the satellite
over several consecutive passes. In that case, a noncommon view ground to space
time transfer $\Delta_{GS}$ can be obtained with an expanded uncertainty of two times
the $u(\Delta_{GS})$ obtained with the quadratic fit: $u_E(\Delta_{GS})$ = 1450 ps ($k$ = 2). A more
sophisticated oscillator model taking into account several external parameters such
as thermal changes or radiation is currently under study.

Several co-location time transfers have been made since the beginning of the
T2L2 project between the two laser stations belonging to the OCA: MeO and the
French transportable laser system (FTLRS).4+4+° Among them, one was conducted
in 2010,*6 with a single H-Maser connected to both MeO and FTLRS to validate
the accuracy of the time transfer, and another one was made with a high performance
time distribution system (Sigma Time STX201) to validate the long-term time
stability. The system was able to monitor the absolute time delay variation introduced
by the coaxial cables used between the laser stations and the time and frequency
laboratory. The first campaign validated the expected error budget (Table 8)
from nine common satellite passes collected over four days. The time offset
measured between MeO and FTLRS was $\Delta_{\text{MeO-FTLRS}}$ = 37 ps (filtered at ±3σ) with
an uncertainty $u(\Delta_{\text{MeO-FTLRS}})$ = 60 ps ($k$ = 1). The second campaign was based
on the same setup except for the distribution of time signals between the time and
frequency laboratory and the laser, which was made with a continuous monitoring
of the delay variation (measured by a dedicated PPS generator including an
internal event timer). Figure 18 illustrates the time stability TDev between MeO and
FTLRS obtained during that campaign. This long-term time stability result is also
in good agreement with the error budget described in Sec. 3.5.

Some other dedicated experiments were made to validate the time transfer 
between remote clocks in common view. A first experiment of that kind was per- 
formed in 2010 with atomic fountains between two observatories in France: the 
“Observatoire de Paris” (OP) and OCA.18 

The mobile laser station FTLRS was installed on a dedicated platform at OP.
Special authorizations were obtained to range with a laser from Paris. A
mobile atomic fountain (FOM)** designed by OP-SYRTE in the frame of the ACES
program was installed at the OCA during the same period. At the OP, FTLRS,
the atomic fountain and both GPS and TWSTFT were connected to the same
H-Maser. At OCA, MeO, FOM and FTLRS were also connected to a common
H-Maser. About 56 satellite passes were recorded over 114 days.

Fig. 18. Estimation of the T2L2 time stability measured between MeO and FTLRS in co-location.
The time interval between consecutive acquisitions is not perfectly constant.

Fig. 19. Microwave and T2L2 time transfer comparison together with atomic fountain differences.
A time offset between each plot is deliberately introduced to facilitate the reading of the graph.
Microwave time transfer solutions and fountain differences were computed by SYRTE (D. Rovera).

Figure 19 shows both the time transfer comparison between T2L2, GPS (carrier
phase based) and TWSTFT (code) and the differences between atomic fountains.
The GPS analysis is based on the PPP NRCan algorithm (carrier phase
technique) developed by Natural Resources Canada. This result does not take into
account calibrations between each of the techniques: an offset for each solution
was introduced to facilitate graph reading. Regardless of these absolute aspects, the
global drift in the microwave-T2L2 time comparisons is better than 2 ns over two
months, which is in accordance with the classical long-term stability of microwave
time transfers.

The phase of the atomic fountain comparison is computed from the frequency 
information between the fountain interrogation and the H-Maser frequency refer- 
ence. For some technical reasons, the atomic fountains did not operate continuously 
during the whole campaign. This has probably introduced significant mid-term
noise.

A second campaign between remote clocks in common view was performed
at the end of 2013 between four distinct sites in Europe in order to realize
a subnanosecond comparison between T2L2 and GPS CV.4748 The GPS CV
comparisons are obtained from the differences computed individually for all satellites
in common view, averaged over several minutes, with some
geometrical compensation and some atmospheric corrections. The campaign was
jointly conducted in autumn 2013 by


- GRSM 7845 OCA Grasse France.
- FTLRS 7828 OP Paris France.
- HERL 7840 SGF Herstmonceux United Kingdom.
- WETL 8834 BKG, FESG Wettzell Germany.


For calibration reasons, only the results between the first three sites were
analyzed. During the whole campaign, 28 common passes were obtained by the OCA
and Herstmonceux, and 13 by the OCA and OP. Each station was calibrated based
on a joint calibration of both laser and GPS. Laser stations were calibrated with the
T2L2 calibration station, while GPS receivers were calibrated using dedicated
equipment moved between stations and operated by OP-SYRTE.*? The uncertainty
of the GPS CV time transfer was estimated at 2.1 ns for the link OCA-OP, 1.5 ns
for SGF-OP and 2.1 ns for SGF-OCA ($k$ = 2).°° The average differences between
the calibrated links T2L2 and GPS were below 250 ps with a standard deviation
below 500 ps. The very good agreement between GPS CV and T2L2 confirms that
the uncertainty budget is consistent with the experimental setup. This is the first
validation whose results agree at the sub-ns level between a GPS
CV time transfer and another fully independent technique (T2L2) performed over
long distances. This is strong evidence of T2L2’s ability to compare other time
transfer techniques accurately.


4. One-Way Lunar Laser Link on LRO Spacecraft 


LRO is a NASA mission that aims at exploring the Moon with numerous scientific
objectives. LRO orbits the Moon at an altitude of 50 km. Among the onboard
instruments, LOLA provides a precise global topographic model
of the lunar surface. LOLA emits a single laser pulse divided into five distinct
beams. The beams are backscattered from the lunar surface and detected by the 
instrument. For each beam and for each laser pulse emitted, LOLA measures time 
of flight, pulse width and energy. The nominal accuracy of the instrument is 10cm. 
To realize this task LOLA includes two distinct emission—reception optical devices, 
a 1064nm 28 Hz-pulsed laser, a photodetection system and an event timer linked to 
an ultrastable oscillator. The mid-term time stability of the oscillator is oy = 107” 
over several thousand seconds. 

LOLA can also be used to perform one-way laser ranging from the Earth to the
Moon. One of the motives for equipping LRO with the one-way ranging functionality
was the expectation that the orbit determination of LRO could be improved due to
a better satellite clock model, leading to a better altimetry profiling of the Moon.
The distance is then computed from the differences between the start and the arrival
epochs of an ensemble of laser events. The principle of the measurement is depicted
in Fig. 20.

Fig. 20. Earth to Moon one-way laser ranging with LRO through the LOLA of the spacecraft
and some laser stations on ground. (For color version, see page I-CP6.)

As compared to a passive two-way laser ranging configuration, the link budget
of such a one-way link varies only with the inverse of the square of the distance $R$.
The signal detected may be written as

\[
N_{pe^-} = \frac{E \lambda}{h c} \, \frac{4 r_{tel}^2}{\theta^2 R^2} \, p_{det} \, p_{atm} \, p_{opt}, \tag{12}
\]

where $E$ is the energy emitted by the ground station, $\theta$ is the divergence of the laser
beam, $R$ is the distance between the Earth and LOLA, $\lambda$ is the wavelength, $p_{det}$ is
the quantum efficiency of the detector, $p_{atm}$ is the atmospheric transmission, $p_{opt}$
is the transmission of the optics and $r_{tel}$ is the radius of the detection telescope.
The measurement is made with a dedicated $r_{tel}$ = 22 mm receiver telescope pointed
toward the Earth°! and linked to the altimeter using an optical fiber (Fig. 21).

The ground segment of this one-way laser ranging is represented by ten laser
stations of the international laser ranging network. The primary ground station is
NASA’s Next Generation Satellite Laser Ranging (NGSLR) station. LOLA receives
the signal from the laser station through a wavelength multiplexer that
separates the altimeter pulses at 1064 nm from laser ranging station pulses at
532 nm. The collimation telescope on the LRO spacecraft has an aperture of 22 mm
and a field of view wide enough to cover the whole Earth. It is mounted on the high
gain RF antenna, which is pointed to Earth.

For each laser pulse emitted by a given laser station and measured by LOLA,
one gets an emission epoch in the time scale of the laser station and a reception
epoch in the LOLA time scale. The range is deduced from the differences of these
epochs. Data are downloaded in real time (+30 s) with a classical microwave link
and preprocessed in order to provide the observer with prompt information on the status
of the uplink. Figure 22 is an example of what the observer can get in real time.

Since the commissioning of the LRO spacecraft, several thousand hours have
been recorded in the frame of this project. The primary orbit determination of the
LRO spacecraft is realized through microwave technology. In addition, one-way laser
ranging is used to give independent orbit solutions. Some comparisons between
ranges computed from the two techniques have shown an average total RMS
difference in the range of 10 m. The optical link between laser ground station and LOLA
has been tested for communication transfer°? at a rate of 300 bits/s.

Fig. 21. RF antenna. The laser ranging receiver telescope is on the left from the center of the
main RF antenna (red ellipse). LOLA is coupled with the telescope through an optical fiber.
(Courtesy: NASA Goddard Space Flight Center). (For color version, see page I-CP6.)

LOLA also makes it possible to perform ground to ground time transfers between laser
stations in a common view mode.°? The major objective is to establish accurate 
ground station times and improve LRO orbit determination. The time transfer is 
computed from the onboard epochs recorded by LOLA and from the differences of 
the times-of-flight from each ground station to the spacecraft. Distances between 
laser stations being short compared to Earth—Moon distances, much higher uncer- 
tainties in times-of-flight from each SLR station to LRO than those required with 
T2L2 can be tolerated. As a consequence, the times-of-flight estimated from the 
conventional RF link are sufficient for time transfer at subnanosecond accuracy. 
For instance, if we consider a distance between laser stations equal to half the 
planet and a RF link uncertainty of 100m, the corresponding differential time of 
flight uncertainty is only 100 ps.°! The validation of that concept has been under- 
taken at Goddard Geophysical and Astronomical Observatory between Moblas-7 
and NGSLR laser stations. Outputs are currently being analyzed. 

It has been proposed to use the LOLA on LRO instrumentation in a
differential mode to localize the spacecraft in the tangential plane as well
(three-dimensional (3D) localization). Unfortunately, the short-term uncertainty of
the onboard event timer is too high (~1 ns, $k$ = 1) to get a useful
outcome.

Fig. 22. Acquisition from the MeO station with the LRO-LOLA website. Each dot is an event
(noise or laser pulse from laser station) detected by LOLA. The x-axis represents the arrival
epoch of the detected event. The y-axis is the arrival epoch in a given temporal gate at the
LOLA time scale.


5. Prospective 


Numerous missions have been proposed with a laser link. Some of them are based on
a coherent laser link well suited for frequency transfer or measurements of position
variation, and others are based on the propagation of laser pulses, best suited
to meet the needs of absolute localization or time transfer.

Several missions are proposed at the scale of the Solar System with distances in 
the range of several billion kilometers. Such distances cannot be measured through 
a classical passive two-way laser ranging scheme: The Earth—Moon distance is now 
considered as a maximum with a link budget in the ratio of 1/10°.54 To go further, 
it is necessary to use a one-way scheme where the link budget varies only with the
inverse of the square of the distance [Eq. (12)]. With a payload instrument based on
optics having an aperture of 100 mm, a beacon divergence of 5 arcsec, 300 mJ per
pulse and a distance of 400 million km, we get $N_{e^-} \approx 1$ electron.
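
The inverse-square scaling of the one-way link can be illustrated with a purely geometric sketch. This is not the chapter's Eq. (12): it assumes a uniformly illuminated (top-hat) beam cone, and the wavelength (532 nm) and the overall efficiency (1%, lumping atmosphere, optics and detector) are assumptions chosen for illustration only. With these assumptions the quoted parameters give a count of order one photoelectron per pulse.

```python
import math

H_PLANCK = 6.62607015e-34            # Planck constant, J s
C_LIGHT = 299_792_458.0              # speed of light, m/s
ARCSEC = math.pi / (180.0 * 3600.0)  # one arcsecond in radians


def one_way_photoelectrons(pulse_energy_j, wavelength_m, full_divergence_rad,
                           distance_m, rx_aperture_m, efficiency):
    """Detected photoelectrons per pulse for a one-way link (top-hat beam model)."""
    # Number of photons leaving the ground station in one pulse.
    photons_emitted = pulse_energy_j * wavelength_m / (H_PLANCK * C_LIGHT)
    # Radius of the illuminated disc at the spacecraft.
    beam_radius_m = 0.5 * full_divergence_rad * distance_m
    # Ratio of receiver area to illuminated area: this is the 1/d^2 term.
    geometric_fraction = (0.5 * rx_aperture_m / beam_radius_m) ** 2
    return photons_emitted * geometric_fraction * efficiency


# Aperture, divergence, energy and distance are the values quoted in the text;
# wavelength and efficiency are ASSUMED (hypothetical) values.
n_e = one_way_photoelectrons(
    pulse_energy_j=0.300, wavelength_m=532e-9, full_divergence_rad=5.0 * ARCSEC,
    distance_m=400e9, rx_aperture_m=0.100, efficiency=0.01)
print(f"photoelectrons per pulse: {n_e:.2f}")
```

Doubling `distance_m` divides the result by four, whereas a passive two-way scheme would lose a factor of sixteen.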

In the context of future space missions for fundamental physics, Solar System 
science and navigation, a laser link for clock comparisons, ranging and data trans- 
mission is of prime importance. Table 9 gives a nonexhaustive list of some future 
missions and associated laser links. 

ELT is a two-way pulsed laser link currently under study for the ACES mission. 
The onboard hardware consists of a retro-reflector, a single photon avalanche diode 
and an event timer connected to the ACES clocks. The ground segment is based 
on the international network of laser ranging stations. The ELT experiment should 
allow a space to ground clock comparison with time stability (TDev) of 4ps over 
300s and 6 ps for the ground to ground time transfer. Thanks to the very good time 
stability of the ACES clocks, the ground to ground comparison in a noncommon 
view configuration over one orbit period should be of the same order of magnitude. 
The time transfer accuracy should be better than 50 ps. Some time transfer
comparisons between ELT and T2L2 should be scheduled at the ACES mission start-up,
together with some comparisons with the microwave link also embedded on ACES. The
project should be launched on the International Space Station (ISS) in 2017.


Table 9. Some future space missions and associated laser links.

Mission           Laser link type                         Laser link name   Funded   Expected launch
ACES              Two-way laser pulsed                    ELT               Yes      2017
GNSS Navigation   Two-way laser pulsed                    OPTI              Yes      2017
Tiangong          Two-way laser pulsed                    LTT               Yes      -
LATOR             2 x one-way pulsed laser transponder    LATOR             No       -
STE-QUEST         PRN modulation                          OPL               No       -
ASTROD I          2 x one-way pulsed laser transponder    ASTROD I          No       -
SAGAS             Two-way coherent-modulated              DOLL              No       -
OSS               One-way laser pulsed/coherent           TIPO/DOLL         No       -


Fig. 23. LTT equipment. On the left is the detector; in the middle lies the main electronic package
including the event timer. (Courtesy: Shanghai Astronomical Observatory). (For color version, see
page I-CP6.)

The OPTI two-way link aims at synchronizing any satellite navigation systems. 
It is designed as a compact device for real-time ground to space clock correc- 
tions, using the existing satellite laser ranging network. The instrument should 
demonstrate time transfer with an uncertainty of 100 ps. This instrument could be 
launched in 2017. 

The time transfer project proposed on the Chinese Tiangong station will be
based upon an LTT instrument designed by SHAO. Figure 23 is a photograph of the
space equipment.

The Laser Astrometric Test of Relativity (LATOR) mission [55,56] architecture is
based on a light triangle formed by laser beacons between two spacecraft placed
in heliocentric orbits and a laser terminal on the ISS. LATOR uses both an optical
interferometer and classical laser ranging techniques to accurately measure the
deflection of light in the Solar System. Laser ranging would be done through a double
one-way link scheme working in an asynchronous mode. The distance accuracy
required for the final scientific objective is 3 mm, corresponding to a delay
measurement uncertainty of 10 ps.
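
The correspondence between delay and distance uncertainties quoted here is a direct multiplication by the speed of light. A minimal sketch, assuming one-way propagation in vacuum (the function names are illustrative, not from any mission software):

```python
C_LIGHT = 299_792_458.0  # speed of light in vacuum, m/s


def delay_to_distance(delay_s: float) -> float:
    """One-way distance travelled by light during delay_s seconds."""
    return C_LIGHT * delay_s


def distance_to_delay(distance_m: float) -> float:
    """One-way light travel time over distance_m metres."""
    return distance_m / C_LIGHT


# 10 ps is the LATOR delay requirement; 1 ns is the LOLA event-timer figure.
for label, delay_s in [("10 ps", 10e-12), ("1 ns", 1e-9)]:
    print(f"{label} -> {delay_to_distance(delay_s) * 1e3:.2f} mm")
```

A 10 ps delay maps to about 3 mm, which is the LATOR requirement, while a 1 ns timing uncertainty maps to about 30 cm.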

The STE-QUEST optical link [57,58] is based on an initial design of the TESAT
laser communication terminal. Scientific objectives of the STE-QUEST mission
require a common view comparison of clocks on ground at the $10^{-18}$ fractional
frequency uncertainty level after a few hours of integration, and space-to-ground
and ground-to-ground time transfers with accuracy better than 50 ps. The link
includes a space terminal with two optical telescopes allowing for simultaneous
bidirectional links and at least two ground stations equipped with an Optical Link
Ground Terminal. The laser link is based on an optical carrier modulated with a
pseudo-random noise (PRN) signal. The optical carrier of the link is generated from the
optical signal of the onboard atomic clock. The PRN modulation transmitted to
the ground station is based on the microwave reference signal. The space optical
terminal receives and cross-correlates the PRN optical carriers replicated by the
ground from the onboard reference signal. The differential delays of the link have
to be calibrated with an uncertainty better than 50 ps.

The objectives of the Astrodynamical Space Test of Relativity using Optical
Devices (ASTROD) mission include measurement of some relativistic parameters
and measurement of several Solar System parameters. ASTROD is developed in
two distinct missions, ASTROD I [59] and ASTROD-GW [60]. ASTROD I uses a single
spacecraft carrying four lasers and a clock, together with some laser stations on
ground. The distance is measured with a double asynchronous one-way laser ranging
system with an uncertainty in the range of 1 mm. ASTROD-GW is based on three
spacecraft, with two spacecraft in separate solar orbits and one near Earth. In
ASTROD-GW, each payload comprises a proof mass, two lasers, a clock and a drag-free
system. The distances between the spacecraft are ranged optically and coherently.

The Search for Anomalous Gravitation using Atomic Sensors (SAGAS) mission [61]
aims at flying sensitive atomic sensors and a laser link on a Solar System
escape trajectory in 2020-2030. SAGAS has several science objectives in fundamental
physics and Solar System science. The payload comprises an optimized
optical atomic clock, an absolute atomic accelerometer and a laser link (DOLL) for 
ranging, frequency comparison and communication. The optical link is based on a 
continuous stabilized laser locked to the optical clock and a coherent heterodyne 
detection system. The link is based on a 40cm telescope for the space segment 
and 1.5m on ground. Some preliminary projects were initiated in 2009-2012 by the 
SYRTE institute (Mini-DOLL) in order to define more precisely the design of that 
laser link.S*:63 The Allan deviation obtained on a fixed target through a turbulent 
atmosphere (5 km) was o, (lms) = 28nm and o; (1s) = 1.4m. 

The aims of the Outer Solar System (OSS) mission [64] are shared by planetary
science and fundamental physics. The mission uses a single spacecraft in the Solar
System equipped with several specific instruments. In particular, it comprises some
instruments (optical and microwave) allowing a precise tracking of the payload during
the cruise for the measurement of the Eddington parameter γ. Two solutions are
envisioned for the measurement in the optical domain: a one-way pulsed laser
system, TIPO [65], or a two-way coherent laser concept, DOLL, identical to the
instrument used in the SAGAS mission. In the TIPO concept, the distance is computed
from the measurement of the time-of-flight of laser pulses emitted from an Earth
laser station and received by the space vehicle carrying a clock, a time tagging unit
and a photodetection system. The time-of-flight is deduced from the differences
between start and arrival times measured at the respective time scales of the clock
on ground and the clock in space. The TIPO instrument is made up of a telescope,
a photodetection device and an event timer linked to an ultrastable clock. At the
lowest level, the propagation delay and distance between Earth and spacecraft are
deduced from the differences between the arrival and departure times. The behavior
of the clocks is a major factor in the performances of the experiment. With a
compact rubidium clock,®® having a time stability of og = 2.5- 107! - 7'/? over 
a few thousands of seconds, some interesting measurements like planetary gravity 
fields or Shapiro signatures can be done. With that kind of rubidium clock, the 
instrument is able to measure short-term distance variations with an uncertainty 
of the centimeter level over an integration time of one day. 


6. Conclusion and Outlook 


Modern laser ranging is today a mature technology able to routinely produce range
measurements with a subcentimeter uncertainty. There is a widespread network of
laser ranging stations on ground capable of working together to carry out the
classical laser ranging activities but also to address specific issues such as time transfer. The
first proposal for using the SLR technology in order to realize a time transfer was 
submitted in the early seventies by the LASSO project. Since that date, many other 
projects have been led in Earth’s orbit as well as at the Solar System scale. The best 
operational laser time transfer available today is T2L2 on Jason-2. The project has 
been in successful operation since summer 2008 and extended until 2016. Since 2008, 
several dedicated campaigns have been carried out to demonstrate the performance 
of that time transfer technology and realize several scientific objectives. T2L2 has 
proved its ability to synchronize remote ground clocks with an uncertainty better 
than 100 ps (k = 2). It has routinely established some link from ground to space 
with a time stability of a few picoseconds over several hundred seconds. Three laser 
time transfer projects, developed in the frame of the Compass system, have been 
able to realize a ground to space time scale synchronization with a data spread of 
a few hundred picoseconds. Laser systems have been also used to establish links in 
deep space and are able to achieve better performance at lower power with some 
small apertures as compared to classical microwave systems. Laser ranging tech- 
nology has demonstrated its capability to realize some one-way laser links beyond 
the Earth’s orbit with the operational mission LRO and two impressive demon- 
strations in the Solar System at 24 and 80 million km. The next high performance 
LTT will be supported through the ACES mission with the ELT instrument. Some 
major projects are envisioned with both T2L2 and ELT to realize picosecond time 
transfers in noncommon view configuration and to compare microwaves techniques 
with laser technologies. In the context of fundamental physics, Solar System science 
or navigation, several other challenging missions, pending for approval, suggest to 
work at the scale of the Solar System with distances in the range of hundreds of 
millions of kilometers, such as ASTROD in 2025. 

Direct comparison of clocks over short distances can be made through the use of 
subpicosecond event timers. In that case, the event timer measures the time interval 
between either the PPS signal or the frequency reference of each clock. Some event
timers today are able to obtain a fractional frequency stability of 10~'° over an 
integration period of only a few hundred seconds, or of 10~1® over one day.®” In 
cases when the distances between clocks are greater than some meters, event timers 
can also monitor the propagation delay in the coaxial cables to subtract the possible 
thermal drift. This gives also the possibility to measure the shape of the signals used 
to synchronize the clocks (PPS) with two event timers running in an oscilloscope 
mode. One event timer is set as the trigger for the time reference and the other is 
used to measure the signal for several thresholds. The x-axis is the time difference 
between the two event timers, and the y-axis is the threshold of the signal. This 
concept is of crucial importance to compare different time transfer techniques. 

Recent advances in optical fiber technology are now allowing the realization 
of precise frequency and time transfer through the optical fiber network infras- 
tructure used for worldwide communication.®* ©? Preliminary experiments done in 
several countries such as France, Germany or the USA have shown the possibil- 
ity to realize some links over distances of up to several hundred kilometers. Some 
demonstrations have been made over several hundred kilometers with a fractional 
frequency fluctuation of only a few 10~!% over one day. These performances are 
well suited to distribute the signal generated by the best optical clocks over con- 
tinental distances over the next ten years. It has been also demonstrated that the 
Internet traffic could be maintained during the frequency distribution without any 
notable degradation of the performance. Time transfer accuracy better than 100 ps 
is likely to be achievable by using some specific modulation of the optical carrier. 
The simplest way to do such time transfers through optical fibers is to use optical 
pulses and an asynchronous transponder based on event timers at both ends to 
measure the two-way propagation. Time transfer can also be performed by using 
some modulated codes added on the carrier and some specific modems in order to 
extract a usable synchronization signal. Today, the deployment of these techniques 
through the infrastructure currently used for Internet communication requires the 
installation of some specific equipment to meet metrology requirements. We can 
expect that further progress in the telecommunication technology will enable its 
development as applied to time and frequency transfer. As well, optical communi- 
cation in free space turns out to be a very promising technique for very high speed 
data rate communication. Several proposals are currently under review.”? We can 
expect that the development of such free space optical communications will be used 
in the future for time and frequency transfer. T2L2 and time transfer by optical 
fiber are complementary techniques that are both extremely promising, especially 
if developed jointly. 
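
The two-way scheme with event timers at both ends can be sketched as follows. This is the generic two-way time transfer computation, assuming the forward and return propagation delays are equal and that the clock offset is constant during the exchange; the class and field names are hypothetical and not taken from any fiber-link implementation.

```python
from dataclasses import dataclass


@dataclass
class TwoWayExchange:
    """Four epochs of one pulse exchange, each read on the local clock (seconds)."""
    t_a_emit: float  # pulse leaves station A, dated by clock A
    t_b_recv: float  # same pulse arrives at station B, dated by clock B
    t_b_emit: float  # return pulse leaves station B, dated by clock B
    t_a_recv: float  # return pulse arrives at station A, dated by clock A


def solve_two_way(ex: TwoWayExchange) -> tuple[float, float]:
    """Return (clock B minus clock A, one-way delay), assuming a symmetric path."""
    forward = ex.t_b_recv - ex.t_a_emit   # equals delay + offset
    backward = ex.t_a_recv - ex.t_b_emit  # equals delay - offset
    offset = 0.5 * (forward - backward)
    delay = 0.5 * (forward + backward)
    return offset, delay


# Synthetic example: clock B is 3.2 ns ahead of clock A, one-way delay is 1.5 ms.
true_offset, true_delay = 3.2e-9, 1.5e-3
exchange = TwoWayExchange(
    t_a_emit=10.0,
    t_b_recv=10.0 + true_delay + true_offset,
    t_b_emit=20.0,
    t_a_recv=20.0 - true_offset + true_delay,
)
offset, delay = solve_two_way(exchange)
print(f"offset = {offset * 1e9:.3f} ns, one-way delay = {delay * 1e3:.6f} ms")
```

Because the transponder is asynchronous, the two emissions need not be simultaneous; any asymmetry between the two directions of the fiber appears directly as an error on the recovered offset and has to be calibrated separately.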

The precise and accurate determination of distances is a critical issue in many
fields. Numerous formation flight space missions require significant improvement
in the distance measurement between spacecraft. There is now a general tendency
to increase the number of space vehicles in order to realize very large space detectors
through high precision laser metrology. Recent research in the field of ultrastable
event timers has made it possible to obtain a repeatability error in the subpicosecond
domain and time stabilities of only a few femtoseconds over some hundred
seconds [71-73]. Such femtosecond time stabilities converted into light distance allow
for resolving the wavelength of the light. The association of the time-of-flight with an
interferometric measurement gives the possibility to obtain length measurements
with accuracy much better than a single wavelength. Combined with an optical
interferometer, these femtosecond measurements could combine the very
high resolution given by the interferometer with the absolute measurement
of the time-of-flight measured by the event timer. Several experimental variations
of that concept were proposed in the framework of the ILIADE project [74]. One of
these concepts was based on a femtosecond laser frequency comb associated with 
some interferometric Fabry—Perot filters, a high speed optical modulator and some 
event timers. The Fabry-Perot filters were used to isolate a single line from the 
comb to generate two continuous carriers (respectively on the laser output and 
the return path of the beam) which were mixed together in order to generate the 
interferometric measurement of the system. The high speed modulator was used 
to subtract some pulses from the continuous laser train to create a low frequency 
coded modulation. A high speed detection linked to the event timers was used to 
detect this modulation and to compute the absolute distance. 
In 1859, Le Verrier discovered the Mercury perihelion advance anomaly. This anomaly
turned out to be the first relativistic gravity effect observed. During the 157 years to 
2016, the precisions and accuracies of laboratory and space experiments, and of astro- 
physical and cosmological observations on relativistic gravity have been improved by 
3-4 orders of magnitude. The improvements have been mainly from optical observations 
at first followed by radio observations. The achievements for the past 50 years are from 
radio Doppler tracking and radio ranging together with Lunar Laser Ranging (LLR). At 
present, the radio observations and LLR experiments are similar in the accuracy of test- 
ing relativistic gravity. We review and summarize the present status of solar system tests 
of relativistic gravity. With planetary laser ranging, spacecraft laser ranging and interfer- 
ometric laser ranging (laser Doppler ranging) together with the development of drag-free 
technology, the optical observations will improve the accuracies by another 3-4 orders 
of magnitude in both the equivalence principle tests and solar system dynamics tests 
of relativistic gravity. Clock tests and atomic interferometry tests of relativistic gravity 
will reach an ever-increasing precision. These will give crucial clues in both experimental 
and theoretical aspects of gravity, and may lead to answers to some profound issues in 
gravity and cosmology. 


Keywords: General relativity; experimental tests of relativistic gravity; solar system 
dynamics; ephemerides. 
1. Introduction and Summary


The development of gravity theory stems from the experiments. Newton's theory of
gravity [1] is empirically based on Kepler's laws [2] (which are based on Brahe's
observations) and Galileo's law of free fall [3] (which is based on Galileo's experiment
of motions on inclined planes). Towards the middle of the 19th century, astronomical
observations accumulated a precision which enabled Le Verrier [4] to discover the
Mercury perihelion advance anomaly in 1859. This anomaly is the first relativistic
gravity effect observed. The Michelson-Morley experiment [5], via various developments [6],
prompted the final establishment of the special relativity theory in 1905 [7]. Motivation
for putting electromagnetism and gravity into the same theoretical framework [8],
the precision of the Eötvös experiment [9] on the equivalence, the formulation of the
Einstein Equivalence Principle (EEP) [10] together with the perihelion advance anomaly
led to the road for the final genesis of General Relativity (GR) theory [11-15] in 1915.
As we discussed in Refs. 16 and 17, Einstein proposed the mass-energy equivalence
using the formula $E = mc^2$ in 1905 [18]; Planck reasoned that all energy must
gravitate in 1907 [19]. To characterize the strength of a gravitational source, it would
then be natural to compare the magnitude of the gravitational energy $mU(\mathbf{x},t)$ of a
test particle in the gravitational potential $U(\mathbf{x},t)$ of a gravitating source to the total
mass-energy $mc^2$ of the test particle and define this ratio $\xi(\mathbf{x},t)$ as the
dimensionless gravitational strength of the source at a spacetime point $\mathcal{P}$ [with
coordinates $(\mathbf{x},t)$]:

$$\xi(\mathbf{x},t) = \frac{U(\mathbf{x},t)}{c^{2}}. \tag{1}$$

GR gives strong-field corrections to the Newtonian gravity. The first-order correction
is proportional to this strength $\xi(\mathbf{x},t)$.

For a point source with mass $M$ in Newtonian gravity,

$$\xi(\mathbf{x},t) = \frac{GM}{Rc^{2}}, \tag{2}$$

where $R$ is the distance to the source. For a nearly Newtonian system, we can use
the Newtonian potential for $U$. The strength of gravity for various configurations is
tabulated in Table 1.
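
As an illustration of Eq. (2), the point-source strength GM/(Rc^2) can be evaluated for the Sun and Earth entries of Table 1. The gravitational parameters and distances below are nominal round values assumed for this sketch (they are not taken from the text), so agreement with the table is expected only at the level of its two significant figures.

```python
C = 299_792_458.0            # speed of light, m/s
GM_SUN = 1.32712440018e20    # assumed nominal solar GM, m^3/s^2
GM_EARTH = 3.986004418e14    # assumed nominal terrestrial GM, m^3/s^2


def gravity_strength(gm: float, distance_m: float) -> float:
    """Dimensionless strength GM / (R c^2) of a point source, Eq. (2)."""
    return gm / (distance_m * C**2)


# (source GM, assumed distance in metres) for each configuration.
configurations = {
    "Sun, solar surface":   (GM_SUN, 6.957e8),
    "Sun, Mercury orbit":   (GM_SUN, 5.791e10),
    "Sun, Earth orbit":     (GM_SUN, 1.496e11),
    "Sun, Jupiter orbit":   (GM_SUN, 7.785e11),
    "Earth, Earth surface": (GM_EARTH, 6.371e6),
    "Earth, Moon's orbit":  (GM_EARTH, 3.844e8),
}

strengths = {name: gravity_strength(gm, r) for name, (gm, r) in configurations.items()}
for name, xi in strengths.items():
    print(f"{name:22s} {xi:.2e}")
```

The Galaxy and universe rows of the table are not point sources and are not covered by this sketch.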

From Table 1, it is clear that in the solar system, Mercury has the largest solar
system gravitational potential among all planets and satellites, and hence the largest
general-relativistic solar system gravitational correction. This is why the
general-relativistic deviation of the Mercury orbit from Newtonian theory, the Mercury
perihelion advance anomaly of about 40 arcsec per century, was first observed. When the
observations reached an accuracy of the order of 1 arcsec per century (transit
observations) in the 19th century, a discrepancy from Newtonian gravity would be seen. In
a century, Mercury orbits around the Sun 400 times, amounting to a total angle of
$5 \times 10^{8}$ arcsec. The fractional relativistic correction (perihelion advance anomaly)
of Mercury's orbit is of order $\lambda GM_{\rm Sun}/(dc^{2})$ with $\lambda = 3$ for GR (i.e. about $8 \times 10^{-8}$)
and $d$ the distance of Mercury to the Sun. Therefore, the relativistic correction for
perihelion advance is about 40 arcsec per century. As the orbit determination of
Mercury reached an accuracy better than $10^{-8}$ (about 1 arcsec for solar transit
observations in 100 years), the relativistic corrections to Newtonian gravity became
manifest. Le Verrier discovered this perihelion advance anomaly (anomaly to Newton's
theory) and measured it to be 38 arcsec per century [4]. In 1881, Newcomb
obtained a more precise value (43 arcsec per century) of the Mercury perihelion
advance anomaly [20].


Table 1. The strength of gravity for various configurations.

Source                                  Field position   Strength of gravity ξ
Sun                                     Solar surface    2.1 × 10^{-6}
Sun                                     Mercury orbit    2.5 × 10^{-8}
Sun                                     Earth orbit      1.0 × 10^{-8}
Sun                                     Jupiter orbit    1.9 × 10^{-9}
Earth                                   Earth surface    0.7 × 10^{-9}
Earth                                   Moon's orbit     1.2 × 10^{-11}
Galaxy                                  Solar system     10^{-5} to 10^{-6}
Significant part of observed universe   Our galaxy       1 to 10^{-2}
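
The order-of-magnitude argument of this paragraph can be reproduced numerically. The first function follows the text (fractional correction 3GM/(dc^2) times the total angle swept in 400 orbits). The second uses the standard GR expression for the advance per orbit, 6πGM/[c^2 a(1 - e^2)], which is not written in this passage and is included only for comparison with Newcomb's value. The orbital elements and solar GM are assumed nominal values.

```python
import math

C = 299_792_458.0           # speed of light, m/s
GM_SUN = 1.32712440018e20   # assumed nominal solar GM, m^3/s^2
ARCSEC_PER_TURN = 360 * 3600
RAD_TO_ARCSEC = 180 * 3600 / math.pi


def estimated_advance(orbits_per_century, distance_m, lam=3.0):
    """Text's estimate: lam * GM / (d c^2) times the total angle, in arcsec."""
    total_angle_arcsec = orbits_per_century * ARCSEC_PER_TURN
    return lam * GM_SUN / (distance_m * C**2) * total_angle_arcsec


def gr_advance(semi_major_axis_m, eccentricity, period_days):
    """Standard GR perihelion advance per Julian century, in arcsec."""
    per_orbit_rad = (6 * math.pi * GM_SUN
                     / (C**2 * semi_major_axis_m * (1 - eccentricity**2)))
    orbits_per_century = 36525.0 / period_days
    return per_orbit_rad * orbits_per_century * RAD_TO_ARCSEC


# 400 orbits is the figure used in the text; the other inputs are assumed.
rough = estimated_advance(orbits_per_century=400, distance_m=5.79e10)
exact = gr_advance(semi_major_axis_m=5.7909e10, eccentricity=0.2056, period_days=87.969)
print(f"estimate: {rough:.1f} arcsec/century, GR formula: {exact:.1f} arcsec/century")
```

The rough estimate lands near 40 arcsec per century and the full formula near 43 arcsec per century, in line with the two values quoted above.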

In 1907, Einstein proposed his equivalence principle and derived the gravita- 
tional redshift!®; in 1911, Einstein derived the light deflection in the solar gravi- 
tational field.24:* In 1913, Besso and Einstein?’ worked out a Mercury perihelion 
advance formula in the “Einstein-Grossmann Entwurf” theory [28], but the
calculation contained an error and did not agree with the experimental value. During the
final genesis of GR,’ Einstein!? corrected their 1913 error and obtained a Mer- 
cury perihelion advance value in agreement with the observation.?° Apparently, this 
correct calculation played a significant role in the final genesis of GR. 

Gravitational redshift, gravitational light deflection and relativistic perihelion 
advance are called three classical tests of GR in Einstein’s “The Foundation of the 
General Theory of Relativity.” 9 

Towards the end of the 19th century, there were studies of whether the solar
spectra were displaced from the Doppler spectra [30,31]. Various causes (such as pressure
effect, pole effect, asymmetrical broadening) were found and investigated before
1910.2? % In 1911, Einstein?! noticed the work of Buisson and Fabry,** and explic- 
itly proposed that gravitational redshift might be tested by the examination of the 
solar spectra. From 1914 to 1919, re-analysis of previous solar spectra together with 
a number of new measurements were made. However, the outcome is controversial 
and inconclusive. Earman and Glymour®® gave a detailed account of this history. 
Before Einstein's proposal of relativistic solar deflection of light in 1911, there
were photographs taken for studying the solar corona and to find a sub-Mercurial
planet of solar neighborhood during total eclipses. These photographs were
considered unsatisfactory to study the deflection of light by Perrine upon a question
from Freundlich late 1911 either because of small field and brief exposure time, or
because of eccentric position of the Sun on the plates (see e.g. p. 61 of Ref. 37).
Before 1919, there were four expeditions intended to measure the gravitational
deflection of starlight (in 1912, 1914, 1916 and 1918); because of bad weather or war, the
first three expeditions failed to obtain any results, and the results of the 1918
expedition were never published [37]. In 1919, the observation of gravitational deflection
of light passing near the Sun during a solar eclipse [38] confirmed the relativistic
deflection of light and made GR famous and popular.


aNewton in his Opticks [22] of 1704 proposed the following query for further research: “Do not Bodies
act upon Light at a distance, and by their action bend its Rays, and is not this action strongest at
the least distance?” In 1801, Soldner [23] derived the gravitational bending of light from the corpuscular
nature of light and Newton's universal gravitation. Soldner [23,b] calculated the deflection angle for
light grazing the Sun to be 0.84 arcsec, remarkably close to Einstein's 1911 value of 0.83 arcsec [21].
Cavendish's work on the gravitational bending of light (probably around 1784 [25]) was published
posthumously in 1921 [26].

bFor an English translation and a historical discussion of Soldner's paper [23], see Ref. 24.

The success of Pound and Rebka®® in using Méssbauer effect to verify the grav- 
itational redshift in earth-bound laboratory in 1960 marked the beginning of a new 
era for testing relativistic gravity. At the same time, a careful and more precise test 
of the equivalence principle was performed in Princeton.*° With the development 
of technology and advent of space era, Shapiro*! proposed a fourth test — the time 
delay of radar echoes in gravitational field. Since the beginning of this era, we have 
seen 3-4 orders of improvements for the three classical tests together with many new 
tests. The current technological development is ripe that we are now in a position 
to discern another 3—4 orders of improvements further in testing relativistic gravity 
in the coming 25 years (2016-2040). This will enable us to test the second-order 
relativistic gravity effects. A road map of experimental progress in gravity together 
with its theoretical implication has been shown in Table 2 of Ref. 42. 

The present review updates the solar system test part of a previous review on
“Empirical Foundations of the Relativistic Gravity” [42] (which is a five-year update
of the 1999-2000 review [43]). A companion review on equivalence principles and the
foundation of metric theories of gravity has been already given in Ref. 17. Recently
Manchester [44] has reviewed the pulsar tests of relativistic gravity. A previous review
on the solar system tests of relativistic gravity is from Reynaud and Jaekel [45]. A
good general review on experimental tests of GR is from Will [46].

In Sec. 2, we review the post-Newtonian approximation of GR, the Parametrized 
Post-Newtonian (PPN) framework, and derive the Shapiro time delay and the first- 
order relativistic light deflection as examples. In Sec. 3, we review and discuss the 
solar system ephemerides. In Sec. 4, we update the solar system tests since our 
last review in 2005. In Sec. 5, we discuss ongoing and next generation solar system 
experiments related to testing relativistic gravity with an outlook. 


2. Post-Newtonian Approximation, PPN Framework, Shapiro 
Time Delay and Light Deflection 


The equations of motion of GR, i.e. the Einstein equation, are

$$G_{\mu\nu} = \kappa T_{\mu\nu}, \tag{3}$$

where $T_{\mu\nu}$ is the stress-energy tensor and $\kappa = 8\pi G/c^{4}$ (see e.g. Ref. 47). We use the
MTW [47] conventions with signature −2; these are also the conventions used in Refs. 16,
17; Greek indices run from 0 to 3; Latin indices run from 1 to 3; the cosmological
constant is negligible for solar system dynamics and solar system ephemeris, and is
neglected in this review. Contracting the equations of motion (3), we have

$$R = -\left(\frac{8\pi G}{c^{4}}\right) T, \tag{4}$$

where $T \equiv T_{\mu}{}^{\mu}$. Substituting (4) into (3), we obtain the following equivalent
equations of motion as originally proposed by Einstein:

$$R_{\mu\nu} = \left(\frac{8\pi G}{c^{4}}\right)\left[T_{\mu\nu} - \frac{1}{2}\, g_{\mu\nu} T\right]. \tag{5}$$
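
The step from (3) to (5) can be written out explicitly. The sketch below assumes the standard definition $G_{\mu\nu} \equiv R_{\mu\nu} - \tfrac{1}{2} g_{\mu\nu} R$ of the Einstein tensor, which is not spelled out in the passage, and uses $g^{\mu\nu} g_{\mu\nu} = 4$ in four spacetime dimensions.

```latex
% Contract Eq. (3) with g^{\mu\nu}, then substitute R back into Eq. (3).
\begin{align}
g^{\mu\nu} G_{\mu\nu}
  &= R - \tfrac{1}{2}\,(4)\,R = -R = \kappa T
  \quad\Longrightarrow\quad R = -\kappa T , \\
R_{\mu\nu}
  &= \kappa T_{\mu\nu} + \tfrac{1}{2}\, g_{\mu\nu} R
   = \kappa \left[ T_{\mu\nu} - \tfrac{1}{2}\, g_{\mu\nu} T \right],
  \qquad \kappa = \frac{8\pi G}{c^{4}} .
\end{align}
```

The first line is Eq. (4) and the second is Eq. (5).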


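For weak field in the quasi-Minkowskian coordinates, we express the metric
$g_{\alpha\beta}$ as

$$g_{\alpha\beta} = \eta_{\alpha\beta} + h_{\alpha\beta}, \qquad h_{\alpha\beta} \ll 1. \tag{6}$$

Since $h_{\alpha\beta}$ is a small quantity ($< 4 \times 10^{-6}$) in the solar system gravitational field, we
expand everything in $h_{\alpha\beta}$ and linearize the results to obtain the linear (weak-field)
approximation. With the harmonic gauge (coordinate) condition for $h_{\alpha\beta}$,

$$h_{\alpha\beta}{}^{,\beta} = \frac{1}{2}\,[\mathrm{Tr}(h)]_{,\alpha} + O(h^{2}), \tag{7}$$

the linearized Einstein equation is:

$$h_{\mu\nu,\alpha}{}^{\alpha} = -\left(\frac{16\pi G}{c^{4}}\right)\left[T_{\mu\nu} - \frac{1}{2}\,\eta_{\mu\nu} T\right] + O(h^{2}), \tag{8}$$

where Tr(h) is defined as the trace of $h_{\mu}{}^{\nu}$, i.e. $\mathrm{Tr}(h) \equiv h_{\alpha}{}^{\alpha}$, and $O(h^{2})$ denotes
terms of order of $h_{\alpha\beta}h_{\mu\nu}$ or smaller (see e.g. Refs. 47 and 48). Analogous to classical
electrodynamics, the solution of this equation for GR is

$$h_{\mu\nu}(\mathbf{x},t) = -\frac{4G}{c^{4}} \int \frac{\left[T_{\mu\nu} - \frac{1}{2}\,\eta_{\mu\nu} T\right]_{\rm retarded}}{|\mathbf{x}-\mathbf{x}'|}\, d^{3}x' + O(h^{2}). \tag{9}$$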


2.1. Post-Newtonian approximation 


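For solar system dynamics and solar system ephemerides, we can impose the slow
motion condition, in addition to the weak field condition, i.e.

$$\frac{U}{c^{2}} = O\!\left(\frac{v^{2}}{c^{2}}\right);\quad
\frac{U_{,i}}{c^{2}} = \left(\frac{1}{L}\right) O\!\left(\frac{v^{2}}{c^{2}}\right);\quad
\frac{U_{,0}}{c^{2}} = \left(\frac{1}{L}\right) O\!\left(\frac{v^{3}}{c^{3}}\right);\quad
\frac{U_{,00}}{c^{2}} = \left(\frac{1}{L^{2}}\right) O\!\left(\frac{v^{4}}{c^{4}}\right), \tag{10}$$

where $L$ is a typical length scale and $v$ is a typical velocity of the system
(see e.g. Ref. 47). The solution $h_{\mu\nu}$ of (9) and its trace $h$ in this approximation then
become

$$h_{\mu\nu} = -2\left(\frac{U}{c^{2}}\right)\delta_{\mu\nu} + O\!\left(\frac{v^{3}}{c^{3}}\right);\qquad
h \equiv \mathrm{Tr}(h) = 4\left(\frac{U}{c^{2}}\right) + O\!\left(\frac{v^{3}}{c^{3}}\right), \tag{11}$$

where $U$ is the Newtonian potential which normally contains multipole terms outside
a gravitating body. For a point mass or outside a spherical Sun,

$$U = \frac{GM}{r}, \qquad r = (x^{2} + y^{2} + z^{2})^{1/2}. \tag{12}$$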
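
The slow-motion ordering $U/c^2 = O(v^2/c^2)$ and the trace quoted with Eq. (11) can be checked numerically for Mercury. This is a minimal sketch, assuming signature −2 with $\eta = \mathrm{diag}(1,-1,-1,-1)$, a point-mass Sun as in Eq. (12), and nominal values for the solar GM and for Mercury's mean distance and orbital speed (these inputs are assumptions, not data from the text).

```python
C = 299_792_458.0            # speed of light, m/s
GM_SUN = 1.32712440018e20    # assumed nominal solar GM, m^3/s^2
ETA_DIAG = (1.0, -1.0, -1.0, -1.0)  # Minkowski metric, signature -2


def potential_over_c2(r_m: float) -> float:
    """U/c^2 for a point-mass Sun, with U = GM/r as in Eq. (12)."""
    return GM_SUN / (r_m * C**2)


def h_diagonal(u_over_c2: float) -> list[float]:
    """Diagonal of h_{mu nu} = -2 (U/c^2) delta_{mu nu}, first-order part of Eq. (11)."""
    return [-2.0 * u_over_c2] * 4


def trace_h(h_diag: list[float]) -> float:
    """Tr(h) = eta^{mu nu} h_{mu nu}; eta is diagonal and equal to its inverse."""
    return sum(eta * h for eta, h in zip(ETA_DIAG, h_diag))


r_mercury = 5.791e10   # assumed mean Sun-Mercury distance, m
v_mercury = 4.736e4    # assumed mean orbital speed of Mercury, m/s

u = potential_over_c2(r_mercury)
v2 = (v_mercury / C) ** 2
print(f"U/c^2 = {u:.3e}, v^2/c^2 = {v2:.3e}, ratio = {u / v2:.2f}")
print(f"Tr(h) / (U/c^2) = {trace_h(h_diagonal(u)) / u:.1f}")
```

The two small parameters agree to within a few percent for a nearly circular orbit, which is the virial relation behind the ordering in Eq. (10).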


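With the metric (11), one can already derive the solar deflection of light and the
Shapiro time delay. For a derivation of relativistic precession of Mercury's orbit,
one needs a full post-Newtonian approximation of GR and needs to calculate $h_{00}$ to
$O(v^{4}/c^{4})$ order and $h_{0i}$ to $O(v^{3}/c^{3})$ order. The post-Newtonian approximation for
a perfect fluid in GR is obtained by Chandrasekhar [49]. The metric $g_{\alpha\beta}(= \eta_{\alpha\beta} + h_{\alpha\beta})$
is given by

$$\begin{aligned}
g_{00} &= 1 - 2\frac{U}{c^{2}} + 2\frac{U^{2}}{c^{4}} - 4\frac{\Phi}{c^{4}} + O\!\left(\frac{v^{6}}{c^{6}}\right),\\
g_{0i} &= \left(\frac{7}{2}\right)\frac{V_{i}}{c^{3}} + \left(\frac{1}{2}\right)\frac{W_{i}}{c^{3}} + O\!\left(\frac{v^{5}}{c^{5}}\right),\\
g_{ij} &= -\left(1 + 2\frac{U}{c^{2}}\right)\delta_{ij} + O\!\left(\frac{v^{4}}{c^{4}}\right),
\end{aligned} \tag{13}$$

where

$$U(\mathbf{x},t) = G\int \frac{\rho_{0}(\mathbf{x}',t)}{|\mathbf{x}-\mathbf{x}'|}\, d^{3}x', \tag{14}$$

$$\Phi(\mathbf{x},t) = G\int \frac{\rho_{0}(\mathbf{x}',t)\,\phi(\mathbf{x}',t)}{|\mathbf{x}-\mathbf{x}'|}\, d^{3}x', \qquad
\phi = v^{2} + U + \frac{1}{2}\Pi + \frac{3}{2}\frac{p}{\rho_{0}}, \tag{15}$$

$$V_{i}(\mathbf{x},t) = G\int \frac{\rho_{0}(\mathbf{x}',t)\, v_{i}(\mathbf{x}',t)}{|\mathbf{x}-\mathbf{x}'|}\, d^{3}x', \tag{16}$$

$$W_{i}(\mathbf{x},t) = G\int \frac{\rho_{0}(\mathbf{x}',t)\,[(\mathbf{x}-\mathbf{x}')\cdot\mathbf{v}(\mathbf{x}',t)]\,(x_{i}-x'_{i})}{|\mathbf{x}-\mathbf{x}'|^{3}}\, d^{3}x', \tag{17}$$

with $\rho_{0}(\mathbf{x},t)$ the rest mass density, $\mathbf{v}(\mathbf{x},t)[= (v_{1}, v_{2}, v_{3})]$ the 3-velocity, $U(\mathbf{x},t)$ the
Newtonian potential, $\Pi(\mathbf{x},t)$ the internal energy and $p(\mathbf{x},t)$ the pressure of the
fluid.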
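
The two first-order effects mentioned at the start of this subsection can be evaluated numerically. The expressions used below are the standard results, stated here without derivation: a deflection angle $(1+\gamma)\,2GM/(c^2 b)$ for a ray with impact parameter $b$, and a one-way Shapiro delay of approximately $(1+\gamma)(GM/c^3)\ln(4 r_1 r_2/b^2)$ near superior conjunction, where $\gamma$ is the PPN parameter ($\gamma = 1$ in GR). The constants and the Earth-Mars geometry are assumed nominal values chosen for illustration.

```python
import math

C = 299_792_458.0            # speed of light, m/s
GM_SUN = 1.32712440018e20    # assumed nominal solar GM, m^3/s^2
R_SUN = 6.957e8              # assumed nominal solar radius, m
AU = 1.495978707e11          # astronomical unit, m
RAD_TO_ARCSEC = 180 * 3600 / math.pi


def light_deflection_arcsec(impact_parameter_m: float, gamma: float = 1.0) -> float:
    """First-order deflection angle (1 + gamma) * 2GM / (c^2 b), in arcsec."""
    return (1.0 + gamma) * 2.0 * GM_SUN / (C**2 * impact_parameter_m) * RAD_TO_ARCSEC


def shapiro_delay_one_way(r1_m: float, r2_m: float, impact_parameter_m: float,
                          gamma: float = 1.0) -> float:
    """Approximate one-way delay near superior conjunction (b << r1, r2), in s."""
    return ((1.0 + gamma) * GM_SUN / C**3
            * math.log(4.0 * r1_m * r2_m / impact_parameter_m**2))


grazing = light_deflection_arcsec(R_SUN)                # GR value at the solar limb
half_value = light_deflection_arcsec(R_SUN, gamma=0.0)  # g_00 term only
delay = shapiro_delay_one_way(AU, 1.524 * AU, R_SUN)    # Earth-Mars, grazing ray
print(f"deflection: {grazing:.2f} arcsec (half value {half_value:.2f} arcsec)")
print(f"one-way Shapiro delay: {delay * 1e6:.0f} microseconds")
```

Setting `gamma=0.0` reproduces the half value obtained when only the time-time part of the metric contributes, which is close to the Soldner and Einstein 1911 numbers quoted earlier in the chapter.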
2.2. PPN framework


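For different theories of relativistic gravity, the post-Newtonian metrics are
different. However, the post-Newtonian metric of many relativistic gravity theories can
be encompassed in the PPN framework with nine post-Newtonian parameters $\beta$, $\gamma$,
$\beta_{1}$, $\beta_{2}$, $\beta_{3}$, $\beta_{4}$, $\zeta$, $\Delta_{1}$ and $\Delta_{2}$:

$$\begin{aligned}
g_{00} &= 1 - 2\frac{U}{c^{2}} + 2\beta\frac{U^{2}}{c^{4}} - 4\frac{\Psi}{c^{4}} + \zeta\frac{\mathcal{A}}{c^{4}} + O\!\left(\frac{v^{6}}{c^{6}}\right),\\
g_{0i} &= \left(\frac{7}{2}\right)\Delta_{1}\frac{V_{i}}{c^{3}} + \left(\frac{1}{2}\right)\Delta_{2}\frac{W_{i}}{c^{3}} + O\!\left(\frac{v^{5}}{c^{5}}\right),\\
g_{ij} &= -\left(1 + 2\gamma\frac{U}{c^{2}}\right)\delta_{ij} + O\!\left(\frac{v^{4}}{c^{4}}\right),
\end{aligned} \tag{18}$$

where

$$\Psi(\mathbf{x},t) = G\int \frac{\rho_{0}(\mathbf{x}',t)\,\psi(\mathbf{x}',t)}{|\mathbf{x}-\mathbf{x}'|}\, d^{3}x', \tag{19}$$

$$\psi = \beta_{1}v^{2} + \beta_{2}U + \left(\frac{1}{2}\right)\beta_{3}\Pi + \left(\frac{3}{2}\right)\beta_{4}\frac{p}{\rho_{0}}, \tag{20}$$

$$\mathcal{A}(\mathbf{x},t) = G\int \frac{\rho_{0}(\mathbf{x}',t)\,[(\mathbf{x}-\mathbf{x}')\cdot\mathbf{v}(\mathbf{x}',t)]^{2}}{|\mathbf{x}-\mathbf{x}'|^{3}}\, d^{3}x'. \tag{21}$$

Each gravity theory has a specific set of values for these PPN parameters if
it can be encompassed in the framework. GR has the PPN parameters $\beta = \gamma = 1$,
$\beta_{1} = \beta_{2} = \beta_{3} = \beta_{4} = \Delta_{1} = \Delta_{2} = 1$, and $\zeta = 0$. Brans-Dicke-Jordan theory
has the PPN parameters $\beta = 1$, $\gamma = (1+\omega)/(2+\omega)$, $\beta_{1} = (3+2\omega)/(4+2\omega)$,
$\beta_{2} = (1+2\omega)/(4+2\omega)$, $\beta_{3} = 1$, $\beta_{4} = (1+\omega)/(2+\omega)$, $\zeta = 0$, $\Delta_{1} = (10+7\omega)/(14+7\omega)$
and $\Delta_{2} = 1$ with $\omega$ the Brans-Dicke parameter. Brans-Dicke-Jordan theory is a
scalar-tensor theory. For general scalar-tensor theories without mass terms, their
PPN parameters are $\beta = 1+\Lambda$, $\gamma = (1+\omega)/(2+\omega)$, $\beta_{1} = (3+2\omega)/(4+2\omega)$,
$\beta_{2} = (1+2\omega)/(4+2\omega) - \Lambda$, $\beta_{3} = 1$, $\beta_{4} = (1+\omega)/(2+\omega)$, $\zeta = 0$, $\Delta_{1} = (10+7\omega)/(14+7\omega)$
and $\Delta_{2} = 1$ with $\Lambda$ a second parameter in addition to $\omega$. Both GR and scalar-tensor
theories are conservative and nonpreferred-frame theories. For them, it would be
more convenient to re-define the following linear combinations of parameters as the
new PPN parameters [55,56]:

$$\begin{aligned}
\alpha_{1} &= 7\Delta_{1} + \Delta_{2} - 4\gamma - 4,\\
\alpha_{2} &= \Delta_{2} + \zeta - 1,\\
\alpha_{3} &= 4\beta_{1} - 2\gamma - 2 - \zeta,\\
\zeta_{1} &= \zeta,\\
\zeta_{2} &= 2\beta + 2\beta_{2} - 3\gamma - 1,\\
\zeta_{3} &= \beta_{3} - 1,\\
\zeta_{4} &= \beta_{4} - \gamma.
\end{aligned} \tag{22}$$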


In terms of the nine parameters β, γ, α₁, α₂, α₃, ζ₁, ζ₂, ζ₃ and ζ₄, the only parameters which do not vanish for GR and for scalar–tensor theories are the two parameters β and γ. Indeed, Will and Nordtvedt showed that a theory of gravity which can be encompassed in the PPN framework at the post-Newtonian level possesses all 10 global conservation laws (four for energy-momentum and six for angular momentum) if and only if

$$\zeta_1 = \zeta_2 = \zeta_3 = \zeta_4 = \alpha_3 = 0. \tag{23}$$

The parameters α₁, α₂ and α₃ measure the extent and nature of preferred-frame effects (Refs. 53, 56, 46); any gravity theory with at least one of the αᵢ nonzero is called a preferred-frame theory. In the PPN framework (18), conservative nonpreferred-frame theories can have only two independent parameters, β and γ. General scalar–tensor theories without mass term span the whole class of conservative theories fitted in the PPN framework.
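The statement that only β and γ survive for GR and scalar–tensor theories can be checked mechanically. The following is a minimal sketch, assuming the linear combinations of (22) and the parameter values quoted above; the function names are illustrative and not from the source, and exact rational arithmetic is used so that the cancellations are not blurred by rounding.

```python
from fractions import Fraction as F

def new_ppn_parameters(p):
    """Combinations of Eq. (22) built from the original nine PPN parameters."""
    return {
        "alpha1": 7 * p["Delta1"] + p["Delta2"] - 4 * p["gamma"] - 4,
        "alpha2": p["Delta2"] + p["zeta"] - 1,
        "alpha3": 4 * p["beta1"] - 2 * p["gamma"] - 2 - p["zeta"],
        "zeta1": p["zeta"],
        "zeta2": 2 * p["beta"] + 2 * p["beta2"] - 3 * p["gamma"] - 1,
        "zeta3": p["beta3"] - 1,
        "zeta4": p["beta4"] - p["gamma"],
    }

def general_relativity():
    """PPN values quoted in the text for GR."""
    one = F(1)
    return dict(beta=one, gamma=one, beta1=one, beta2=one, beta3=one,
                beta4=one, zeta=F(0), Delta1=one, Delta2=one)

def scalar_tensor(omega, lam=0):
    """PPN values quoted in the text for scalar-tensor theories without mass
    terms; lam = 0 gives Brans-Dicke-Jordan theory."""
    w, L = F(omega), F(lam)
    return dict(beta=1 + L,
                gamma=(1 + w) / (2 + w),
                beta1=(3 + 2 * w) / (4 + 2 * w),
                beta2=(1 + 2 * w) / (4 + 2 * w) - L,
                beta3=F(1),
                beta4=(1 + w) / (2 + w),
                zeta=F(0),
                Delta1=(10 + 7 * w) / (14 + 7 * w),
                Delta2=F(1))

if __name__ == "__main__":
    for name, theory in [("GR", general_relativity()),
                         ("Brans-Dicke, omega=5", scalar_tensor(5)),
                         ("scalar-tensor, omega=500, Lambda=1/100",
                          scalar_tensor(500, F(1, 100)))]:
        print(name, new_ppn_parameters(theory))
```

All seven combinations come out exactly zero for each theory, which is the content of the remark that these theories are conservative and free of preferred-frame effects.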

Empirically, the preferred-frame and nonconservative parameters α₁, α₂, α₃, ζ₁, ζ₂ and ζ₃ are constrained as follows:

|α₁| < 3.4 × 10⁻⁵ (limit (Ref. 57) from the orbit dynamics of the binary pulsar PSR J1738+0333 (Ref. 58)),

|α₂| < 1.6 × 10⁻⁹ (limit from millisecond pulsars PSR B1937+21 and PSR J1744−1134 (Ref. 59)),

|α₃| < 4.0 × 10⁻²⁰ (limit from the orbital dynamics of the statistical combination of a set of binary pulsars (Ref. 60)),

|ζ₁| < 1.5 × 10⁻³ (limit calculated from the constraints on the Nordtvedt parameter η [= 4β − γ − 3 − α₁ + (2/3)α₂ − (2/3)ζ₁ − (1/3)ζ₂] and other parameters (Table 2)),

|ζ₂| < 4 × 10⁻⁵ (limit from binary pulsar PSR 1913+16 acceleration),

|ζ₃| < 1.5 × 10⁻³ (limit from confirmation of Newton's third law by lunar acceleration).    (24)


As to ζ₄, according to Will (Ref. 46), there is a theoretical relation 6ζ₄ = 3α₃ + 2ζ₁ − 3ζ₃ for gravity theories whose perfect-fluid equations are blind to different forms of internal energy and pressure in the fluid, so that ζ₄ becomes redundant.

Although the PPN framework (18) encompasses a large class of gravity theories, there are still many gravity theories outside its scope. One notable example is Whitehead theory as completed by imposing the EEP. Its post-Newtonian metric contains additional terms which have to be parametrized by an additional parameter ξ (or ξ_W), called the Whitehead parameter. These additional terms with parameter ξ can be included in an extended PPN framework. For Whitehead theory, ξ = 1 (by definition). Solar system tests and constraints on the Whitehead terms have been studied in Refs. 66-69, with |ξ| constrained to the order of 10⁻³. The constraint from millisecond pulsars gives |ξ| < 3.9 × 10⁻⁹. Also, the PPN framework (18) does not contain the intermediate-range gravity terms (Yukawa terms). These terms can be included in a separate treatment. Misner et al. (Ref. 47) have treated the case with anisotropic stresses. For this case, there is a post-Newtonian term in g₀₀ with an extra parameter in addition to the parameter β₄. However, the anisotropic stresses are much smaller than the isotropic stresses or pressures in the solar system. For solar system dynamics considerations, they have been negligible up to now.

Historically, Eddington first used the parametrization of the metric for discussing the classical tests of relativistic gravity, based on the isotropic post-Newtonian expansion of the Schwarzschild metric with the following line element:

$$ds^{2} = \left[1 - 2\alpha\frac{GM}{rc^{2}} + 2\beta\left(\frac{GM}{rc^{2}}\right)^{2} + \cdots\right]c^{2}dt^{2} - \left[1 + 2\gamma\frac{GM}{rc^{2}} + \cdots\right]\left(dr^{2} + r^{2}d\theta^{2} + r^{2}\sin^{2}\theta\,d\varphi^{2}\right), \tag{25}$$

where α, β, γ are called the Eddington parameters. For metric theories, EEP is assumed already. The Eddington parameters should not depend on the mass-energy contents. To have the correct Newtonian limit, α can be absorbed into a re-definition of the Newtonian gravitational constant G (i.e. αG → G). Hence, as a parametrized post-Newtonian framework, there are only two effective Eddington parameters, β and γ. In 1968, Nordtvedt developed the first modern version of the PPN framework for a system of two gravitating point masses, with later generalization to more particles. It contains seven parameters in addition to α. In 1971, Will extended the PPN framework to perfect fluid with two additional parameters β₃ and β₄ (or ζ₃ and ζ₄ in the new combination of parameters) for terms on internal energy and pressure. This framework does not contain the parameter α. As we noticed in the comment after (24), ζ₄ (or β₄) would be a redundant parameter if EEP is assumed. So is ζ₃ (or β₃). They are really parameters for testing EEP. As we discussed at the beginning of this paragraph, the α parameter tests EEP too. In a metric framework like the PPN framework (18), seven parameters can be explored. This means the 1968-Nordtvedt and 1971-Will frameworks are effectively equivalent.

Moreover, the preferred-frame parameters and the conservation-law parameters α₁, α₂, ζ₁, ζ₂ and α₃ are essentially test parameters for the Strong Equivalence Principle (SEP). They are constrained to observe SEP quite well [see e.g. (24)]. In the rest of this paper, we concentrate mainly on the experiments to test the two parameters β and γ. In the next two sections, we illustrate the PPN effects with two simple calculations: the Shapiro time delay and the gravitational deflection of light passing near the Sun.
2.3. Shapiro time delay


One can derive the light propagation equation in the weak field limit for the physical metric (6). Let r = r(t) be the light trajectory, where r(t) = (x(t), y(t), z(t)) is a 3-vector. Light propagation follows null geodesics of the metric g_αβ; its trajectory r(t) satisfies

$$0 = ds^{2} = g_{\alpha\beta}\,dx^{\alpha}dx^{\beta} = (1+h_{00})c^{2}dt^{2} + 2h_{0i}\,c\,dx^{i}dt + (\eta_{ij}+h_{ij})\,dx^{i}dx^{j}, \tag{26}$$

using (6).
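In the Minkowski approximation, the light trajectory can be approximated by

$$\frac{dx^{i}}{dt} = \left(\frac{dx^{i}}{dt}\right)^{(0)} + O(h) = c\,n^{(0)i} + O(h), \quad\text{with}\quad \sum_{i=1}^{3}\left(n^{(0)i}\right)^{2} = 1, \tag{27}$$

where n^(0)i are constants. In the post-Minkowski approximation, we express dx^i/dt as

$$\frac{dx^{i}}{dt} = c\,n^{(0)i} + c\,n^{(1)i} + O(h^{2}), \tag{28}$$

where n^(1)i is a function of the trajectory and is of order O(h). Substituting (28) into (26) and dividing by dt², we have

$$0 = (1+h_{00})c^{2} + 2h_{0i}\,c\,\frac{dx^{i}}{dt} - \left|\frac{d\mathbf{r}}{dt}\right|^{2} + h_{ij}\frac{dx^{i}}{dt}\frac{dx^{j}}{dt} \tag{29a}$$

$$\phantom{0} = (1+h_{00})c^{2} + 2h_{0i}\,c\left(c\,n^{(0)i} + c\,n^{(1)i}\right) - c^{2}\sum_{i=1}^{3}\left(n^{(0)i} + n^{(1)i}\right)^{2} + h_{ij}\,n^{(0)i}n^{(0)j}c^{2} + O(h^{2}). \tag{29b}$$

Simplifying (29b), we have

$$\sum_{i=1}^{3} n^{(0)i}n^{(1)i} = \frac{1}{2}\left(h_{00} + 2h_{0i}\,n^{(0)i} + h_{ij}\,n^{(0)i}n^{(0)j}\right) + O(h^{2}), \tag{30}$$

and solving for |dr/dt| in (29a), we obtain the light propagation equation to O(h) order:

$$\left|\frac{d\mathbf{r}}{dt}\right| = c\left[1 + h_{00} + 2h_{0i}\,n^{(0)i} + h_{ij}\,n^{(0)i}n^{(0)j} + O(h^{2})\right]^{1/2} = c\left[1 + \frac{1}{2}h_{00} + h_{0i}\,n^{(0)i} + \frac{1}{2}h_{ij}\,n^{(0)i}n^{(0)j} + O(h^{2})\right]. \tag{31}$$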

From (31), we calculate the light travel time Δt_TT between two observers (time delay, Ref. 41) as

$$\Delta t_{TT} = \frac{1}{c}\int |d\mathbf{r}|\left[1 - \frac{1}{2}h_{00} - h_{0i}\,n^{(0)i} - \frac{1}{2}h_{ij}\,n^{(0)i}n^{(0)j}\right] + O(h^{2}). \tag{32}$$

Choosing the z-axis along the initial light propagation direction, i.e. n^(0)i = (0, 0, 1), and using (18) or (25) for a slow-motion observer and Sun (or a central mass), we have

$$\Delta t_{TT} = \frac{1}{c}(z_{2}-z_{1}) + (1+\gamma)\frac{G_{N}M}{c^{3}}\ln\left[\frac{(z_{2}^{2}+b^{2})^{1/2}+z_{2}}{(z_{1}^{2}+b^{2})^{1/2}+z_{1}}\right] + O(h^{2}) = \Delta t^{N} + \frac{1+\gamma}{2}\,\Delta t_{S}^{GR} + O(h^{2}), \quad (z_{1}<0,\ z_{2}>0), \tag{33}$$

where the first term is the Newtonian travel time Δt^N (Römer delay), the second term is the relativistic Shapiro time delay (Ref. 41) with Δt_S^GR the general relativistic Shapiro time delay, and b is the impact parameter of light propagation to the Sun.
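To get a feeling for the size of the relativistic term in (33), the following minimal sketch evaluates it for a ray grazing the Sun. The logarithm is written as a difference of inverse hyperbolic sines, which is algebraically identical to the logarithm in (33) and numerically safer when z₁ < 0. The solar radius and the choice of endpoints at ±1 au are illustrative assumptions, not values from the text; GM_Sun and the au are the values quoted later in this chapter.

```python
import math

GM_SUN = 1.32712440042e20   # m^3/s^2 (DE430 value quoted later in the text)
C = 299792458.0             # m/s
AU = 149597870700.0         # m
R_SUN = 6.957e8             # m, nominal solar radius (assumption, not from the text)

def newtonian_travel_time(z1, z2):
    """First term of Eq. (33): straight-line light travel time in seconds."""
    return (z2 - z1) / C

def shapiro_delay(z1, z2, b, gamma=1.0):
    """Second term of Eq. (33), one-way, in seconds.

    ln{[sqrt(z2^2+b^2)+z2] / [sqrt(z1^2+b^2)+z1]} = asinh(z2/b) - asinh(z1/b)
    """
    log_term = math.asinh(z2 / b) - math.asinh(z1 / b)
    return (1.0 + gamma) * GM_SUN / C**3 * log_term

if __name__ == "__main__":
    delay = shapiro_delay(-AU, AU, R_SUN)
    print(f"Newtonian travel time: {newtonian_travel_time(-AU, AU):.1f} s")
    print(f"Shapiro delay (gamma = 1): {delay * 1e6:.1f} microseconds")
```

For these assumed endpoints the one-way delay is of order 10⁻⁴ s, and it scales with (1 + γ)/2, which is why ranging through solar conjunction constrains γ.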


2.4. Light deflection 


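The geodesic equation for light and for test particles in GR and in the metric theories of gravity,

$$\frac{d^{2}x^{\mu}}{d\lambda^{2}} + \Gamma^{\mu}_{\alpha\beta}\frac{dx^{\alpha}}{d\lambda}\frac{dx^{\beta}}{d\lambda} = 0, \quad \lambda:\ \text{affine parameter}, \tag{34}$$

can be cast in the form

$$\frac{d}{d\lambda}\left(g_{\mu\nu}\frac{dx^{\nu}}{d\lambda}\right) = \frac{1}{2}\,g_{\alpha\beta,\mu}\frac{dx^{\alpha}}{d\lambda}\frac{dx^{\beta}}{d\lambda}. \tag{35}$$

Integrating, we obtain

$$\left[g_{\mu\nu}\frac{dx^{\nu}}{d\lambda}\right]_{\lambda_{0}}^{\lambda_{1}} = \frac{1}{2}\int_{\lambda_{0}}^{\lambda_{1}} g_{\alpha\beta,\mu}\frac{dx^{\alpha}}{d\lambda}\frac{dx^{\beta}}{d\lambda}\,d\lambda. \tag{36}$$

To obtain the light deflection angle in a weak gravitational field of the Sun or other source, we choose the x-axis in the initial light (photon) propagation direction and the y-axis in the plane spanned by the Sun (or other gravitational source) and the light trajectory; the sense of the x-axis is in the direction of the trajectory. From the μ = y component of (36), we obtain

$$\left[g_{0y} + g_{yy}\,\frac{1}{c}\frac{dy}{dt}\right]_{x_{0}}^{x_{1}} = \frac{1}{2}\int_{x_{0}}^{x_{1}}\left(h_{00,y} + h_{xx,y} + 2h_{0x,y}\right)c\,dt + O(h^{2}). \tag{37}$$

Solving for dy/dt in (37), substituting (18) or (25) in and simplifying, we obtain

$$
\begin{aligned}
\Delta\varphi_{\rm deflection} &= \frac{1}{c}\left.\frac{dy}{dt}\right|_{x_{0}}^{x_{1}} = -\frac{1}{2}\int_{x_{0}}^{x_{1}}\left(h_{00,y} + h_{xx,y} + 2h_{0x,y}\right)c\,dt + O(h^{2})\\
&= -\left[\frac{G_{N}M}{c^{2}b}\,\frac{x}{(x^{2}+b^{2})^{1/2}}\right]_{x=x_{0}}^{x=x_{1}} - \gamma\left[\frac{G_{N}M}{c^{2}b}\,\frac{x}{(x^{2}+b^{2})^{1/2}}\right]_{x=x_{0}}^{x=x_{1}}\\
&= -(1+\gamma)\frac{G_{N}M}{c^{2}b}\left(\cos\theta_{1} - \cos\theta_{0}\right)
\end{aligned}
\tag{38}
$$

for the deflection angle Δφ_deflection, where θ₀ (θ₁) is the angle between the position vector of the light emitter (observer) and the x-axis. For star light with a close impact (b much smaller than the emitter distance, so that cos θ₀ ≈ −1), we have

$$\Delta\varphi_{\rm deflection} = -(1+\gamma)\frac{G_{N}M}{c^{2}b}\left(\cos\theta_{1} + 1\right). \tag{39}$$

If cos θ₁ ≈ 1 and γ = 1, we obtain the usual formula of GR, i.e. Δφ_deflection = −4(G_N M/c²b).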
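As a numerical illustration of the source-at-infinity formula, |Δφ| = (1+γ)(G_N M/c²b)(1 + cos θ₁), the sketch below evaluates the deflection for a ray grazing the solar limb and for a star seen at right angles to the Sun from 1 au. The geometric identification b = r_obs sin χ and θ₁ = χ for an observer at distance r_obs who sees the star at elongation χ is an assumption made for this example, and the solar radius is a nominal value that does not come from the text.

```python
import math

GM_SUN = 1.32712440042e20   # m^3/s^2 (DE430 value quoted later in the text)
C = 299792458.0             # m/s
AU = 149597870700.0         # m
R_SUN = 6.957e8             # m, nominal solar radius (assumption, not from the text)
RAD_TO_ARCSEC = math.degrees(1.0) * 3600.0

def deflection_angle(b, cos_theta1=1.0, gamma=1.0):
    """Magnitude of the deflection (radians) for a source at infinity."""
    return (1.0 + gamma) * GM_SUN / (C**2 * b) * (1.0 + cos_theta1)

def deflection_at_elongation(chi, r_obs=AU, gamma=1.0):
    """Deflection (radians) for a star at elongation chi (radians) from the Sun,
    assuming b = r_obs*sin(chi) and theta_1 = chi."""
    return deflection_angle(r_obs * math.sin(chi), math.cos(chi), gamma)

if __name__ == "__main__":
    limb = deflection_angle(R_SUN) * RAD_TO_ARCSEC
    right_angle = deflection_at_elongation(math.pi / 2) * RAD_TO_ARCSEC * 1e3
    print(f"grazing the limb: {limb:.3f} arcsec")
    print(f"at 90 deg elongation from 1 au: {right_angle:.2f} mas")
```

With γ = 1 this gives about 1.75 arcsec at the limb and about 4.07 mas at 90° elongation, the figure quoted for the Hipparcos measurements later in the text.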


3. Solar System Ephemerides 


Planetary ephemerides are a must for precision tests of relativistic gravity in the solar system and for the orbit design of spacecraft and missions. Before the advent of the space age, the analytical theories developed by Le Verrier, Hill, Newcomb and Clemence on planetary motion had sufficient accuracy to account for ongoing optical observations. With the Doppler radio tracking of spacecraft and the radio/laser ranging to planets and the Moon, the required accuracy of planetary ephemerides increased tremendously. The accuracy of analytical theories became inadequate, and the use and development of numerical methods started.

Since the motions of the planets and the Moon are influenced by the other planets and moons, one needs a complete solar system ephemeris to test relativistic gravity. To do this, one would start with the PPN equations of motion in an appropriate gauge for celestial bodies. Because the separations of the planets and moons are large compared with their sizes, one could treat them as point particles with suitable multipole moments. Such a set of PPN equations of motion is the post-Newtonian barycentric equations of motion as derived by Brumberg from the post-Newtonian barycentric metric with PPN parameters β and γ for solar system bodies. The metric with the gauge parameters α (not to be confused with the Eddington parameter α in the last section) and ν set to zero, corresponding to the harmonic gauge adopted by the 2000 IAU resolution, is
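$$
\begin{aligned}
ds^{2} ={}& \Bigg[1 - 2\sum_{i}\frac{m_{i}}{r_{i}} + 2\beta\Big(\sum_{i}\frac{m_{i}}{r_{i}}\Big)^{2} + (4\beta-2)\sum_{i}\frac{m_{i}}{r_{i}}\sum_{j\neq i}\frac{m_{j}}{r_{ij}}\\
& - c^{-2}\sum_{i}\frac{m_{i}}{r_{i}}\Big(2(\gamma+1)\dot{\mathbf{x}}_{i}^{2} - \mathbf{r}_{i}\cdot\ddot{\mathbf{x}}_{i} - \frac{1}{r_{i}^{2}}(\mathbf{r}_{i}\cdot\dot{\mathbf{x}}_{i})^{2}\Big)\\
& + \frac{m_{1}R_{1}^{2}}{r_{1}^{3}}J_{2}\Big(3\Big(\frac{\mathbf{r}_{1}\cdot\hat{\mathbf{z}}}{r_{1}}\Big)^{2} - 1\Big)\Bigg]c^{2}dt^{2}\\
& + 2c^{-1}\sum_{i}\frac{m_{i}}{r_{i}}\big((2\gamma+2)\dot{\mathbf{x}}_{i}\big)\cdot d\mathbf{x}\,c\,dt - \Big[1 + 2\gamma\sum_{i}\frac{m_{i}}{r_{i}}\Big](d\mathbf{x})^{2},
\end{aligned}
\tag{40}
$$

where r_i = x − x_i, r_ij = x_i − x_j, r_i = |r_i|, r_ij = |x_i − x_j|, m_i = GM_i/c², and the M_i's are the masses of the celestial bodies with M₁ the solar mass. J₂ is the quadrupole moment parameter of the Sun. ẑ is the unit vector in the direction of the solar angular momentum. The associated equations of motion of the N-mass problem derived from the geodesic variational principle of this metric (the effect of the solar quadrupole moment not yet added) are

$$
\begin{aligned}
\ddot{\mathbf{x}}_{i} ={}& -\sum_{j\neq i}\frac{GM_{j}}{r_{ij}^{3}}\mathbf{r}_{ij} + \sum_{j\neq i} m_{j}\left(A_{ij}\mathbf{r}_{ij} + B_{ij}\dot{\mathbf{x}}_{ij}\right),\\
A_{ij} ={}& \frac{\dot{\mathbf{x}}_{i}^{2}}{r_{ij}^{3}} - (\gamma+1)\frac{\dot{\mathbf{x}}_{ij}^{2}}{r_{ij}^{3}} + \frac{3}{2r_{ij}^{5}}(\mathbf{r}_{ij}\cdot\dot{\mathbf{x}}_{j})^{2} + G\big[(2\gamma+2\beta+1)M_{i} + (2\gamma+2\beta)M_{j}\big]\frac{1}{r_{ij}^{4}}\\
& + \sum_{k\neq i,j} GM_{k}\left[(2\gamma+2\beta)\frac{1}{r_{ij}^{3}r_{ik}} + (2\beta-1)\frac{1}{r_{ij}^{3}r_{jk}} + \frac{2(\gamma+1)}{r_{ij}r_{jk}^{3}} - \Big(2\gamma+\frac{3}{2}\Big)\frac{1}{r_{ik}r_{jk}^{3}} - \frac{1}{2r_{jk}^{3}}\frac{\mathbf{r}_{ij}\cdot\mathbf{r}_{ik}}{r_{ij}^{3}}\right],\\
B_{ij} ={}& \frac{1}{r_{ij}^{3}}\big[(2\gamma+2)(\mathbf{r}_{ij}\cdot\dot{\mathbf{x}}_{ij}) + (\mathbf{r}_{ij}\cdot\dot{\mathbf{x}}_{j})\big],
\end{aligned}
\tag{41}
$$

with ẋ_ij = ẋ_i − ẋ_j.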

These equations can be used to build a computer-integrated planetary ephemeris framework. For a complete ephemeris, one needs to fit observational data to obtain solar and planetary parameters together with a set of initial conditions at a specific epoch; for a working ephemeris, one could simply adopt the parameters, including planetary positions and velocities at some epoch, from a complete fundamental ephemeris. For example, in our working CGC 1 ephemeris (CGC: Center for Gravitation and Cosmology; Ref. 75), we numerically integrated equations (41) with (40) (setting β = γ = 1 for GR, and J₂ = 2 × 10⁻⁷ for the Sun) for the eight planets plus Pluto, the Moon, the Sun and the three big asteroids Ceres, Pallas and Vesta (14-body evolution); the positions and velocities at the epoch 2005.6.10 0:00 are taken from the DE403 ephemeris (Ref. 76). The evolution (which can go forward or backward in time) is solved using the fourth-order Runge–Kutta method with step size h = 0.01 day. Since the tilt of the axis of the solar quadrupole moment to the perpendicular of the ecliptic plane is small (7°), we have neglected this tilt in the CGC 1 ephemeris. In the CGC 2 ephemeris (Ref. 77), we have added the perturbations of an additional 489 asteroids. Such ephemerides can be used for mission orbit design/optimization and mission simulation. We used CGC 1 for orbit simulation and parameter determination for the Astrodynamical Space Test of Relativity using Optical Devices (ASTROD). Using this ephemeris as a deterministic model and adding stochastic terms to simulate noise, we generated simulated ranging data and used Kalman filtering to determine the accuracies of fitted relativistic and solar system parameters for 1050 days of the ASTROD mission. This way, we simulated the accuracy achievable for the ASTROD mission concept. For a better evaluation of the accuracy of Ġ/G, we need also to monitor the masses of other asteroids. For this, we considered all 492 known asteroids with diameter greater than 65 km to obtain an improved ephemeris framework, CGC 2, and calculated the perturbations due to these 492 asteroids on the ASTROD spacecraft (Ref. 77). More recently, we applied different variants of the CGC 2 ephemeris framework to study the ASTROD I orbit design/optimization/simulation (Ref. 78), the ASTROD-GW orbit design/optimization (Ref. 79), and the numerical Time Delay Interferometry (TDI) for the space gravitational-wave mission concepts ASTROD-GW (Ref. 80), LISA (Ref. 81) and eLISA (Ref. 82).
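The integration scheme described above can be illustrated on a much smaller problem. The sketch below is not the CGC code: it assumes the standard test-particle limit of the PPN N-body equations (a massless body around a Sun at rest), in which the post-Newtonian acceleration reduces to (GM/c²r³)[(2(β+γ)GM/r − γv²) r + 2(1+γ)(r·v) v], and advances it with one classical fourth-order Runge–Kutta step. The function names and the circular test orbit are illustrative assumptions.

```python
import math

GM_SUN = 1.32712440042e20   # m^3/s^2 (DE430 value quoted later in the text)
C = 299792458.0             # m/s
AU = 149597870700.0         # m

def acceleration(r, v, beta=1.0, gamma=1.0, post_newtonian=True):
    """Acceleration of a test particle around a Sun at rest (SI units)."""
    d = math.sqrt(sum(x * x for x in r))
    a = [-GM_SUN * x / d**3 for x in r]          # Newtonian term
    if post_newtonian:
        v2 = sum(u * u for u in v)
        rv = sum(x * u for x, u in zip(r, v))
        k = GM_SUN / (C**2 * d**3)
        radial = 2.0 * (beta + gamma) * GM_SUN / d - gamma * v2
        a = [ai + k * (radial * x + 2.0 * (1.0 + gamma) * rv * u)
             for ai, x, u in zip(a, r, v)]
    return a

def rk4_step(r, v, h, **kw):
    """One fourth-order Runge-Kutta step of size h (seconds)."""
    def deriv(r, v):
        return v, acceleration(r, v, **kw)

    def shifted(dr, dv, s):
        return ([x + s * a for x, a in zip(r, dr)],
                [u + s * a for u, a in zip(v, dv)])

    k1r, k1v = deriv(r, v)
    k2r, k2v = deriv(*shifted(k1r, k1v, h / 2))
    k3r, k3v = deriv(*shifted(k2r, k2v, h / 2))
    k4r, k4v = deriv(*shifted(k3r, k3v, h))
    r_new = [x + h / 6 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(r, k1r, k2r, k3r, k4r)]
    v_new = [u + h / 6 * (a + 2 * b + 2 * c + d)
             for u, a, b, c, d in zip(v, k1v, k2v, k3v, k4v)]
    return r_new, v_new

if __name__ == "__main__":
    v0 = math.sqrt(GM_SUN / AU)                  # circular speed at 1 au
    r, v = [AU, 0.0, 0.0], [0.0, v0, 0.0]
    h = 0.01 * 86400.0                           # step size of 0.01 day
    for _ in range(1000):                        # integrate 10 days
        r, v = rk4_step(r, v, h)
    print(r, v)
```

At 1 au the post-Newtonian correction is about 3GM/(c²r) ≈ 3 × 10⁻⁸ of the Newtonian acceleration, which shows why small step sizes and careful numerics are needed to resolve relativistic effects in an ephemeris.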

At present, there are three series of complete fundamental ephemerides for the solar system: Development Ephemerides (DE; Ref. 83), Intégrateur Numérique Planétaire de l'Observatoire de Paris (INPOP; Ref. 84) and Ephemerides of Planets and Moon (EPM; Ref. 85). The major common feature of these three series of ephemerides is the simultaneous numerical integration of the equations of motion of the eight planets plus Pluto, the Sun, the Moon, and the lunar physical libration, using the post-Newtonian approximation of GR in a harmonic coordinate system. In addition, they include different numbers of asteroids and Trans-Neptunian objects (TNOs) in the integration of the ephemerides.

Let us illustrate with the EPM ephemerides (Ref. 85). Specifically, the basic dynamical model of EPM2011 is the post-Newtonian equations of motion of the Sun, the Moon, the eight planets plus Pluto (now a TNO), and the five largest asteroids, with the following relatively weak gravitational effects taken into account:


(a) perturbations from the known 301 of the most massive asteroids; 

(b) perturbations from the other minor planets in the main asteroid belt, modeled 
by a homogeneous ring; 

(c) perturbations from the known 21 largest TNOs; 

(d) perturbations from the other trans-Neptunian planets, modeled by a homoge- 
neous ring at a mean distance of 43 AU; 

(e) perturbations from the solar oblateness (J₂ = 2 × 10⁻⁷).


The main data sets fitted by the three current ephemerides come from the astrometric observations of planets and spacecraft. For EPM2011 (Ref. 85), they include (i)
optical observations of the outer planets and their satellites made from 1913 to 2011 
(57560 data points); (ii) radar observations of Mercury, Venus, and Mars from 1961 
to 1997 (58112 data points); (iii) radio data provided by spacecraft from 1971 to 
2010 (561998 data points). 

To obtain a complete ephemeris from (41) with (40), or equivalent formulas, and observations, one needs to fit for the mass parameters, relevant multipole moments, initial positions and initial velocities of the planets, the Moon and the Sun, together with some other solar system bodies like the three largest asteroids Ceres, Pallas and Vesta. Before 2009, in fitting the data for ephemerides, instead of the mass parameter GM_Sun of the Sun, one could use the following relation to fit or adjust the astronomical unit (au):

$$GM_{\rm Sun}\ (\mathrm{m^{3}\,s^{-2}}) = k^{2}\,\mathrm{au}^{3}\ (\mathrm{m^{3}})\,/\,86400^{2}\ (\mathrm{s^{2}}) \tag{42}$$

with k = 0.017 202 098 95, the Gaussian gravitational constant. The astronomical unit is a basic unit in astronomy and was supposed to be close to the mean Earth distance to the Sun. With the development of ranging observations in the solar system, it could be related precisely to the SI meter through the ephemeris fitting. Standish determined the au to be 149 597 870 697.4 m when working out DE410 in 2003 (Ref. 86). Pitjeva (Ref. 87) determined the au to be 149 597 870 696.0 m when working out EPM2004. The difference of 1.4 m represents the realistic error in the determination of the au. In 2009, Pitjeva and Standish (Ref. 88) proposed to the IAU Working Group on Numerical Standards for Fundamental Astronomy (NSFA) the masses of the three largest asteroids (Ceres, Pallas and Vesta), the ratio of the Moon's mass to the Earth's mass, and the au from the fitting/adjustment of DE421 (Ref. 89) and EPM2009 (Ref. 90). In this determination from ephemerides, the DE421 value of the au is 149 597 870 699.6 ± 0.15 m and the EPM2009 value of the au is 149 597 870 696.6 ± 0.1 m, where the quoted uncertainties are formal uncertainties. They estimated the realistic error to be 3 m. From this result, they proposed to adopt the numerical value of the au in meters to be 149 597 870 700 (3) m. They also concluded that the numerical value of the au in meters is identical in both the Barycentric Dynamical Time (TDB)-based and the Barycentric Coordinate Time (TCB)-based systems of units if one uses the conversion proposed by Irwin and Fukushima (Ref. 91), Brumberg and Groten (Ref. 92), and Brumberg and Simon (Ref. 93). This value of the au was accepted and included by the XXVIIth IAU General Assembly at Rio de Janeiro in 2009 as part of the IAU (2009) System of Astronomical Constants (Resolution B2; Ref. 94).
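The link between the au and GM_Sun through the Gaussian gravitational constant, GM_Sun = k² au³/86400² with k = 0.017 202 098 95, can be made concrete with a few lines of arithmetic. The sketch below is a hedged illustration only; the function names are hypothetical, and the comparison value is the DE430 GM_Sun quoted in this section.

```python
K_GAUSS = 0.01720209895     # Gaussian gravitational constant (from the text)
DAY = 86400.0               # seconds per day
AU_2009 = 149597870700.0    # m, IAU (2009) value quoted in the text

def gm_sun_from_au(au_m):
    """GM_Sun in m^3/s^2 from the au in meters, GM = k^2 au^3 / day^2."""
    return K_GAUSS**2 * au_m**3 / DAY**2

def au_from_gm_sun(gm):
    """Inverse relation: the au in meters from GM_Sun in m^3/s^2."""
    return (gm * DAY**2 / K_GAUSS**2) ** (1.0 / 3.0)

if __name__ == "__main__":
    gm = gm_sun_from_au(AU_2009)
    print(f"GM_Sun = {gm / 1e9:.0f} km^3/s^2")
    # A 3 m change in the au (the realistic error quoted in the text)
    # changes GM_Sun by a fraction of about 3 * 3 m / au.
    shift = gm_sun_from_au(AU_2009 + 3.0) / gm - 1.0
    print(f"fractional change for a 3 m shift of the au: {shift:.2e}")
```

Because GM_Sun scales as au³, the 3 m realistic uncertainty in the au corresponds to a fractional uncertainty of about 6 × 10⁻¹¹ in GM_Sun.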

At the XXVIIIth IAU General Assembly at Beijing in 2012, the astronomical unit in meters was changed from a fitted value to a defining constant, similar to the speed of light: 1 au = 149 597 870 700 m. The abbreviation of the astronomical unit should be au (lower case). In the fitting of ephemerides, GM_Sun should then be used instead of the au. The current values from the ephemeris fitting are: DE430 (Ref. 95), GM_Sun = 132 712 440 042(10) km³/s²; INPOP13b,c, GM_Sun = 132 712 440 044.5(0.2) km³/s²; EPM2014, GM_Sun = 132 712 440 053(1) km³/s².

In the actual testing of relativistic theories of gravity, one fits additional PPN 
or relativistic parameters. In the next section, we compile these ephemeris tests 
together with other solar system tests. 


4. Solar System Tests 


Over the last 50 years, we have seen great advances in the dynamical testing of relativistic gravity. This is largely due to interplanetary radio ranging/tracking and Lunar Laser Ranging (LLR). Interplanetary radio ranging and tracking provided more stimuli and progress at first. However, with accuracy improved from 20-30 cm to 2 mm and with the long accumulation of observation data, LLR reaches an accuracy in determining relativistic parameters similar to that of interplanetary radio ranging, even though the relativistic effects are weaker in LLR. Table 2 gives such a comparison.

Relativistic perihelion advance and the solar quadrupole moment. In the PPN equations of motion (41) with two PPN parameters β and γ and with the effect of the solar quadrupole moment added, the solar contribution to the secular planetary perihelion advance per orbit is given by the well-known formula (see e.g. Ref. 47, p. 1116):

$$\Delta\varphi = \frac{6\pi GM_{\rm Sun}}{a(1-e^{2})c^{2}}\left[\frac{2+2\gamma-\beta}{3} + \frac{J_{2}R_{\rm Sun}^{2}c^{2}}{2GM_{\rm Sun}\,a(1-e^{2})}\right], \tag{43}$$

where a and e are the semimajor axis and the eccentricity of the planet orbit, and R_Sun is the solar radius. If the Sun were uniformly rotating throughout, J₂ would be about 2 × 10⁻⁷ and the magnitude of its contribution would amount to about 0.05% of the general relativistic perihelion advance for Mercury. To measure or separate the relativistic term (the first term), one needs to know or measure the solar quadrupole parameter. There are three ways to measure the solar quadrupole parameter: (i) through the solar oblateness measurement based on the brightness of the solar surface; (ii) through the helioseismology measurement; (iii) through the measurement of the perihelion advance of different planets and asteroids. Although the solar oblateness measurements up to the 1980's might imply a large solar quadrupole parameter, the determination of the internal rotation of the Sun, through measurement of the rotation-induced frequency splitting in the observed solar surface acoustic power spectrum at about the same period of time, gave the value J₂ = (1.7 ± 0.4) × 10⁻⁷, rather close to the value for a uniformly rotating Sun.

In the 12th International Conference on GR and Gravitation at Boulder in 1989, Shapiro (Ref. 99) reported the cumulative measurement of the relativistic Mercury perihelion advance to be (42.98 ± 0.04) arcsec/century, assuming the solar quadrupole parameter J₂ to be about 2 × 10⁻⁷; this gives β = 1.000 ± 0.003.
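The numbers in this discussion can be reproduced with the standard per-orbit formula Δφ = [6πGM_Sun/(a(1−e²)c²)]·[(2+2γ−β)/3 + J₂R_Sun²c²/(2GM_Sun a(1−e²))]. The sketch below is a rough check under stated assumptions: Mercury's semimajor axis, eccentricity and orbital period, and the solar radius, are nominal values supplied for illustration and are not taken from the text.

```python
import math

GM_SUN = 1.32712440042e20   # m^3/s^2 (DE430 value quoted in the text)
C = 299792458.0             # m/s
R_SUN = 6.957e8             # m, nominal solar radius (assumption)
RAD_TO_ARCSEC = math.degrees(1.0) * 3600.0

# Nominal orbital elements of Mercury (assumptions, not from the text)
A_MERCURY = 5.790905e10     # m
E_MERCURY = 0.205630
P_MERCURY_DAYS = 87.9691

def perihelion_advance_per_orbit(a, e, beta=1.0, gamma=1.0, j2=0.0):
    """Perihelion advance per orbit in radians (relativistic + J2 terms)."""
    p = a * (1.0 - e * e)
    scale = 6.0 * math.pi * GM_SUN / (C**2 * p)
    relativistic = (2.0 + 2.0 * gamma - beta) / 3.0
    quadrupole = j2 * R_SUN**2 * C**2 / (2.0 * GM_SUN * p)
    return scale * (relativistic + quadrupole)

def arcsec_per_century(per_orbit_rad, period_days):
    return per_orbit_rad * (36525.0 / period_days) * RAD_TO_ARCSEC

if __name__ == "__main__":
    gr = arcsec_per_century(
        perihelion_advance_per_orbit(A_MERCURY, E_MERCURY), P_MERCURY_DAYS)
    with_j2 = arcsec_per_century(
        perihelion_advance_per_orbit(A_MERCURY, E_MERCURY, j2=2e-7), P_MERCURY_DAYS)
    print(f"GR advance: {gr:.2f} arcsec/century")
    print(f"J2 = 2e-7 adds a fraction {with_j2 / gr - 1.0:.1e}")
    # With gamma and J2 held fixed, an error of 0.04 arcsec/century
    # maps to an error of 3 * 0.04 / 42.98 in beta.
    print(f"implied uncertainty of beta: {3.0 * 0.04 / gr:.4f}")
```

With these inputs the GR value is about 42.98 arcsec/century, the J₂ = 2 × 10⁻⁷ term adds a fraction of order 5 × 10⁻⁴ to 6 × 10⁻⁴, and a 0.04 arcsec/century measurement error corresponds to roughly 0.003 in β, in line with the figures quoted above.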

In the 1990's, the solar quadrupole moment issue was basically settled: (i) The solar oblateness ε measured in a balloon flight of the Solar Disk Sextant (SDS) on 1992 September 30 (Ref. 100) is (8.63 ± 0.88) × 10⁻⁶ with the inferred J₂ = (3 ± 6) × 10⁻⁷ (in agreement with the measured and inferred values ε = (9.6 ± 6.5) × 10⁻⁶ and J₂ = (10 ± 43) × 10⁻⁷ of Hill and Stebbins (Ref. 101) in 1975). A subsequent analysis (Ref. 102) based on SDS balloon flight data from both 1992 and 1994, combined with solar surface angular rotation data, gives the solar quadrupole moment parameter J₂ = 1.8 × 10⁻⁷ and the solar octopole moment parameter J₄ = 9.8 × 10⁻⁷. (ii) The high quality helioseismological data obtained from the Solar and Heliospheric Observatory (SoHO) and from the Global Oscillation Network Group (GONG) made a much better determination of the solar internal structure and solar differential rotation possible; this in turn led to a good determination of the solar quadrupole moment and solar angular momentum. Pijpers (Ref. 103) did an analysis and obtained J₂ = (2.14 ± 0.09) × 10⁻⁷ from the GONG data, and J₂ = (2.23 ± 0.09) × 10⁻⁷ from the SoHO/MDI data, with an error-weighted mean J₂ = (2.18 ± 0.06) × 10⁻⁷. Godier and Rozelot (Ref. 104) used a differential rotation model established from helioseismological data and integrated J₂ from the core to the surface to obtain a slightly lower value of J₂ = 1.60 × 10⁻⁷.

The results of space-borne measurements of solar oblateness from 1997 to 2011 give basically consistent numbers, as summarized by Meftah et al. in 2015 (Ref. 105): SoHO/MDI by Emilio et al. in 2007 (Ref. 106), with the oblateness (the solar equator-to-pole radius difference) Δr = 8.7 ± 2.8 mas using 676.78 nm (λ) observation (2007); RHESSI/SAS by Fivian et al. in 2008 (Ref. 107), with Δr = 8.01 ± 0.14 mas using 670.0 nm observation in 2004; SDO/HMI by Kuhn et al. (2012) (Ref. 108), with Δr = 7.2 ± 0.49 mas using 617.3 nm observation (2011-2012); Picard/SODISM by Irbah et al. (2014) (Ref. 109), with Δr = 8.4 ± 0.3 mas using 535.7 nm observation (2011); and Picard/SODISM by Meftah et al. (2015) (Ref. 105), with Δr = 7.86 ± 0.32 mas using 782.2 nm observation (2010-2011). It is to be noted that Δr = 8 mas corresponds to J₂ = 1.60 × 10⁻⁷.

The third way (iii) to measure the solar quadrupole moment is through the gravitational field it generates. This will be discussed in the following together with the ephemeris fitting.

Test of relativistic gravity through ephemeris fitting. As planetary ephemerides became more and more precise, Anderson et al. (Ref. 110) in 2002 used the JPL archive of planetary positional data and the DE ephemeris fitting method to solve for all the conventional parameters in the DE ephemeris, plus four more parameters (β, γ, J₂ and Ġ/G) specific to testing relativistic gravity. In fitting the data, they weighted the separate data sets, except four data sets for Mars, such that the assumed standard error for each data set is equal to the RMS residual for that particular set after the fit. For Mars, they used a standard error equal to five times the RMS residual for each of the four data sets (orbit data from Mariner 9 (1971-1972), lander data from Viking (1976-1982), orbit data from Mars Global Surveyor (1998-2000) and lander data from Pathfinder (1997)) to compensate for systematic error from asteroid perturbations. This way they interpreted the uncertainties of their resulting parameter values after the fit as realistic errors instead of formal errors. The results of their fitting (Ref. 110) are β = 0.9990 ± 0.0012, γ = 0.9985 ± 0.0021, J₂ = (2.3 ± 5.2) × 10⁻⁷, and Ġ/G = ±(1.1–1.8) × 10⁻¹²/yr (the Ġ/G value is the same as their previous result in Ref. 111).

As an application of the developing EPM ephemerides, Pitjeva (Ref. 87; EPM2004 fitting) in 2004 obtained a determination of β and γ simultaneously with estimations for the solar quadrupole parameter and the possible variability of the gravitational constant: β = 1.0000 ± 0.0001, γ = 0.9999 ± 0.0001, J₂ = (1.9 ± 0.3) × 10⁻⁷ and Ġ/G = (1 ± 5) × 10⁻¹⁴/yr. In working out the INPOP10a planetary ephemeris, Fienga et al. (Ref. 112) tested relativistic gravity by fitting β or γ, and obtained β = 0.999959 ± 0.000078, γ = 1.000038 ± 0.000081 and J₂ = (2.4 ± 0.25) × 10⁻⁷. Pitjeva, in working out the EPM2011 ephemerides in 2013 (Ref. 85), obtained β = 0.99998 ± 0.00003, γ = 1.00004 ± 0.00006, J₂ = (2.0 ± 0.2) × 10⁻⁷ and [d(GM_Sun)/dt]/GM_Sun = (−5.0 ± 4.1) × 10⁻¹⁴/yr. Verma et al. (Ref. 113) included the radio ranging observations of MESSENGER, improved our knowledge of the orbit of Mercury, obtained the INPOP13a ephemeris, and used it to perform tests of relativistic gravity. Their estimations of parameters are β = 1.000002 ± 0.000025, γ = 0.999997 ± 0.000025 (β = 1 fixed) and J₂ = (2.4 ± 0.2) × 10⁻⁷.

Fienga et al. (Ref. 114) added supplementary range tracking data obtained from the analysis of the MESSENGER spacecraft from 2011 to 2014 and included in their INPOP15a planetary ephemerides the new JPL datasets (Ref. 115) obtained after the new analysis of Cassini tracking data from 2004 to 2014. They used INPOP15a to estimate possible supplementary advances of perihelia for Mercury and Saturn to test GR, and presented their results at the 14th Marcel Grossmann meeting. The results are basically consistent with the previous analysis; no violations of GR are found.

Using analytic and numerical methods, Anderson et al. (Ref. 116) demonstrated that Earth–Mars ranging could provide a useful estimate of the SEP parameter η. For Mars ranging measurements with an accuracy of σ meters over 10 years, the expected accuracy for the Nordtvedt SEP parameter η would be of order (1–12) × 10⁻⁴ σ according to Ref. 116. The SEP for the Earth–Mars–Sun–Jupiter system is probably already tested implicitly in the ephemeris fit. It remains to separate the effect of SEP violation in the fit.

Time variability of the gravitational constant and mass loss from the Sun. The solar system dynamics could measure the possible time variability of the gravitational constant and the mass change of the Sun when the precision becomes good. If the gravitational constant does not change or its change is measured in another way, the mass loss (change) from the Sun can be measured dynamically. We advocated this potential during the 1990's when we proposed the concept of ASTROD (Ref. 117). The electromagnetic radiation of the Sun carries away a fraction 6.8 × 10⁻¹⁴ of the solar mass each year. This is the largest mass change of the Sun. Other mass change mechanisms give fractional changes of similar order but of smaller magnitude. In order to separate out the time variability of the gravitational constant, estimates of the mass loss from the Sun and of the mass accretion onto the Sun are needed. Pitjeva and Pitjev (Ref. 118) made an estimate of the mass of celestial bodies falling into the Sun (mainly comets) and gave the following annual upper limit:

$$\frac{\dot{M}_{\rm comet}}{M_{\rm Sun}} < 3.2\times10^{-14}. \tag{44}$$

Combined with the estimate of an annual solar wind loss of (2–3) × 10⁻¹⁴ M_Sun per year (see Ref. 118 for references), Pitjeva and Pitjev gave the following bounds on the annual mass loss Ṁ_loss of the Sun:

$$-9.8\times10^{-14} < \frac{\dot{M}_{\rm loss}}{M_{\rm Sun}} < -3.6\times10^{-14}. \tag{45}$$

The fitted value of

$$\frac{d(GM_{\rm Sun})/dt}{GM_{\rm Sun}} = (-5.0\pm4.1)\times10^{-14}/\mathrm{yr} \quad (3\sigma) \tag{46}$$

from EPM (Refs. 85, 118) led to the following relation at the 95% confidence (2σ) level:

$$-7.8\times10^{-14}/\mathrm{yr} < \frac{\dot{G}}{G} + \frac{\dot{M}_{\rm Sun}}{M_{\rm Sun}} < -2.3\times10^{-14}/\mathrm{yr} \quad (2\sigma). \tag{47}$$

Equation (47) together with (45) gave bounds on Ġ/G (Ref. 118) as

$$-4.2\times10^{-14}/\mathrm{yr} < \frac{\dot{G}}{G} < +7.5\times10^{-14}/\mathrm{yr}. \tag{48}$$
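The chain of bounds above is plain interval arithmetic and can be traced step by step. In the sketch below, all numbers are in units of 10⁻¹⁴ per year; rescaling the 3σ uncertainty of the fitted d(GM_Sun)/dt to 2σ by a factor 2/3 is an assumption of Gaussian errors made for this illustration, and the helper names are hypothetical.

```python
def two_sigma_interval(central, three_sigma_error):
    """2-sigma interval from a central value and its 3-sigma error,
    assuming Gaussian errors (the error scales linearly with sigma)."""
    err = three_sigma_error * 2.0 / 3.0
    return central - err, central + err

def gdot_over_g_bounds(mu_dot_interval, mass_loss_interval):
    """Bounds on Gdot/G from bounds on (Gdot/G + Mdot/M) and on Mdot/M.

    Gdot/G = (Gdot/G + Mdot/M) - Mdot/M, so the lower bound pairs the
    smallest sum with the largest Mdot/M, and vice versa.
    """
    lo = mu_dot_interval[0] - mass_loss_interval[1]
    hi = mu_dot_interval[1] - mass_loss_interval[0]
    return lo, hi

if __name__ == "__main__":
    # Fitted (d(GM_Sun)/dt)/GM_Sun = (-5.0 +/- 4.1) at 3 sigma
    print(two_sigma_interval(-5.0, 4.1))            # close to (-7.8, -2.3)
    # Bounds on the sum and on the solar mass loss, as quoted in the text
    print(gdot_over_g_bounds((-7.8, -2.3), (-9.8, -3.6)))
```

The second call returns approximately (−4.2, 7.5), the bounds on Ġ/G quoted from Pitjeva and Pitjev; the first differs slightly from (−7.8, −2.3) only because the rounded inputs −5.0 and 4.1 are used.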


More recently, Fienga et al. (Ref. 119) used Monte Carlo simulations to find constraints on the possible variation of the gravitational constant. They deduced the values of Ġ/G considering a fixed value for the annual mass loss of the Sun (including radiation and solar winds),

$$\frac{\dot{M}_{\rm loss}}{M_{\rm Sun}} = (-9.2\pm4.6)\times10^{-14}, \tag{49}$$

extracted from solar physics measurements and from the variations of Ṁ_loss/M_Sun during the 11-year solar cycle of Pinto et al. (Ref. 120). The values of Ġ/G are typically within ±10 × 10⁻¹⁴/yr.

Solar system dynamics also constrains dark energy models. Interested readers are referred to Refs. 121 and 122.

Light/radio wave deflection, Shapiro time delay and constraint on the Eddington parameter γ. As we have seen in Sec. 1, gravitational light deflection is one of the three classical tests of GR. Before the ephemeris determination of the Eddington (light deflection) parameter γ, the Very Long Baseline Interferometry (VLBI) measurement of the gravitational deflection of radio waves by the Sun from astrophysical radio sources had been an important method. The accuracy of the observation had been improved to 1.7 × 10⁻³ for γ (Robertson et al., Ref. 123; Lebach et al., Ref. 124) in 1995. Analysis using VLBI data from 1979-1999 improved the result by about four times to 0.99983 ± 0.00045 (Shapiro et al., Ref. 125). Fomalont et al. (Ref. 126) used the Very Long Baseline Array (VLBA) at 43, 23 and 15 GHz to measure the solar gravitational deflection of radio waves among four radio sources during an 18-day period in October 2005 and determined the Eddington parameter γ to be 0.9998 ± 0.0003. Fomalont and Kopeikin (Refs. 127, 128) measured the effect of retardation of gravity by the field of moving Jupiter via VLBI observation of light bending from a quasar.


In 2003, Bertotti et al. (Ref. 129) reported a measurement of the frequency shift of radio photons due to the relativistic Shapiro time delay effect from the Cassini spacecraft as they passed near the Sun during the June 2002 solar conjunction. From this measurement, they determined γ to be 1.000021 ± 0.000023.

With the Hipparcos mission, very accurate measurements of star positions at various elongations from the Sun were accumulated (Ref. 130). Most of the measurements were at elongations greater than 47° from the Sun. At these angles, the relativistic light deflections are typically a few mas; it is 4.07 mas at right angles to the solar direction for an observer at 1 AU from the Sun according to GR. In the Hipparcos


Table 


Relativity-parameter 


lunar/satellite laser ranging. 


determination from interplanetary radio ranging/tracking and from 


Parameter Meaning Value from solar system determinations Value from 
and from gravity probe B lunar/satellite 
laser ranging 
B PPN®*® .000 + 0.003%? (Perihelion shift with 
Nonlinear J2(Sun) = 10-7 assumed) 1.003 + 0.00513 
gravity 0.9990 + 0.001219 (solar system tests 1.00012 + 0.0011136,129 
with Jo(Sun) = (2.3 + 5.2) x 1077 fitted) (with Cassini + ) 
.0000 + 0.000187 (EPM2004 fit) 1.00017 + 0.00015158 
0.999959 + 0.000078!!? (INPOP 10a fit) 1.00006 + 0.00011138 
0.99998 + 0.000038° (EPM2011 fit) (from 7) 
y PPN®5 .000 + 0.00299 (Viking ranging 
Space curvature time delay) 
0.9985 + 0.00211! (solar system tests) 
.000021 + 0.000023!29 (Cassini S/C ranging) 1.000 + 0.005135 
0.9999 + 0.000187 (EPM2004 fit) 
0.9998 + 0.0003!26 (VLBI deflection) 
.000038 + 0.000081!!2 (INPOP10a fit) 
.00004 + 0.000068> (EPM2011 fit) 
Kgp Geodetic precession 0.99935 + 0.0028!49 (gravity probe B) 0.997 + 0.007185 
0.9981 = 0.006418¢ 
0.997 + 0.005138 
Kp-r Lense— 0.95 + 0.19'49 (gravity probe B) 0.994 + (0.1-0.3)146 


Thirring effect 


(LAGEOS) 
0.994 + 0.002 + 0.05159 
(LAGEOS & LARES) 

Table 2. (Continued) 


Parameter Meaning Value from solar system determinations Value from 
and from gravity probe B unar/satellite 
laser ranging 


E (7) SEP The SEP for the Earth—Mars— (3.2446) « 10-41 
(Nordtvedt parameter) Sun—Jupiter system!!® js probably (-2. 2.0) « 19718 186.145 
already tested implicitly in the (-0. 1,3) < 10-4 
ephemeris fit. It remains to separate (0.9 1,9) x 10713188 
the effect of SEP violation. (n = (1+ 3) x 107-4139) 
G/G Temporal change inG = +(1.1 — 1.8) x 10~1?/yrt44 (14:8) x 10-1 fy 
(Solar System Tests) (449) x 10713 /yr186 
(1 +5) x 10714/yr.8” (EPM2004 fitting) G44 15) x 10-4 foe 
(—4.2 to 7.5)x10—14/yr!8 (Planets & S/C 
observations with solar mass loss estimate) 
+10 x 10-14/yr!19 (INPOP & Monte Carlo with 
solar mass loss estimate) 
G/G Temporal change in G (44£5) x 107} yr—2140 
OY ukawa Intermediate a4 —1 Seu = DE 18) @ 10-1 &)=380 000 km 
range force (perihelion of Mars) = (—0.6 + 1.8) x 10711138 




1-392 W.-T. Ni 


measurements, each abscissa on a reference great-circle has a typical precision of 
3 mas for a star with 8-9 mag. There are about 3.5 million abscissae generated, and 
the precision in angle or similar parameter determination is in the μas range. Froeschlé 
et al.131 analyzed these Hipparcos data and determined the light deflection parameter 
γ to be 0.997 ± 0.003. This result demonstrated the power of precision optical 
astrometry. 
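
As a rough check of the deflection figures quoted for Hipparcos, the following minimal sketch evaluates the first-order PPN light deflection δ = (1 + γ)(GM⊙/c²r) cot(χ/2) for an observer at distance r from the Sun and a source at elongation χ. This closed form is the standard textbook expression and is an assumption here, since the text does not spell it out; the function name `deflection_mas` and the constant values are likewise illustrative choices.

```python
import math

GM_SUN = 1.32712440018e20   # m^3 s^-2, heliocentric gravitational constant (assumed value)
C = 299_792_458.0           # m s^-1, speed of light
AU = 1.495978707e11         # m, astronomical unit
RAD_TO_MAS = math.degrees(1.0) * 3600.0 * 1000.0  # radians to milliarcseconds


def deflection_mas(elongation_deg, gamma=1.0, r_m=AU):
    """First-order PPN light deflection (mas) at a given elongation from the Sun."""
    chi = math.radians(elongation_deg)
    return (1.0 + gamma) * GM_SUN / (C**2 * r_m) / math.tan(chi / 2.0) * RAD_TO_MAS


# Elongations mentioned in the text: 35 deg (Gaia limit), 47 deg (Hipparcos), 90 deg.
for elong in (35.0, 47.0, 90.0):
    print(f"elongation {elong:5.1f} deg: deflection {deflection_mas(elong):.2f} mas")
```

With γ = 1 the 90° case reproduces the 4.07 mas quoted above, and the deflection grows toward smaller elongations, which is why observing closer to the Sun increases the weight of a γ determination.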

Global Astrometric Interferometer for Astrophysics (Gaia) is an 
astrometric mission aiming at the broadest possible astrophysical exploitation of 
optical interferometry using a modest baseline length (~3 m). Gaia, launched 
on 19 December 2013 by Arianespace using a Soyuz ST-B/Fregat-MT rocket 
flying from Kourou in French Guiana, operates in a Lissajous orbit around the 
Sun-Earth Lagrangian point L2. It is charting a three-dimensional map of our Galaxy, the 
Milky Way, in the process revealing the composition, formation and evolution of 
the Galaxy. Operating in the depths of space, far beyond the Moon’s orbit, ESA’s 
Gaia spacecraft had completed two years of a planned five-year survey of the sky on 
16 August 2016. Data Release 1 (Gaia DR1)!%8 was already public and contained 
astrometric results for more than 1 billion stars brighter than magnitude 20.7 based 
on observations collected by the Gaia satellite during the first 14 months of its 
operational phase. Gaia has already provided unprecedented positional and radial 
velocity measurements with the accuracies needed to produce a stereoscopic and 
kinematic census of about one billion stars in our Galaxy and throughout the Local 
Group. This amounts to about 1% of the Galactic stellar population. To increase 
the weight of measuring the relativistic light deflection parameter γ, Gaia observes 
at elongations greater than 35° (as compared to essentially 47° for Hipparcos) from 
the Sun. A simulation shows that Gaia could measure γ to 1 × 10⁻⁵ to 2 × 10⁻⁷ 
accuracy.!83 

LLR Tests of relativistic gravity. In the last column of Table 2, the values come 
from LLR observations.135-140 Reference 135 gave the results as of 1996. In Ref. 136, 
Williams et al. used a total of 15 553 LLR normal-point data in the period of March 
1970 to April 2004 from Observatoire de la Côte d’Azur, McDonald Observatory 
and Haleakala Observatory in their determination. Each normal point comprises 
from 3 to about 100 photons. The weighted rms scatter after their fits for the ten-year 
range from 1994 to 2004 is about 2 cm (about 5 × 10⁻¹¹ of range). Müller 
et al. wrote a comprehensive chapter on “Lunar Laser Ranging and Relativity” and 
summarized their work on the LLR tests of the relativistic gravity.1°* From Table 2, 
we can see clearly that the LLR tests of relativistic gravity have the same level of 
precision as the radio solar system tests. Constraints on an intermediate-range force 
from LLR138 and from the Mars perihelion precession (Iorio141) are listed in 
the last row. 

LLR also constrains dark energy models. For interested readers, please see 
Ref. 142 and references therein. 

Frame Dragging Effects. In 1918, Lense and Thirring143 predicted that a rotating 
body drags the local inertial frames of reference around it in GR. In 1960, 
Schiff144 showed that in GR, the spin axis of a gyroscope orbiting around Earth 
would undergo both geodetic drift in the orbit plane due to motion through the 
spacetime curved by the Earth’s mass and frame-dragging due to the Earth’s rotation 
with respect to a distant inertial frame. The dragging of a gyro’s spin axis is 
sometimes called the Schiff effect while both spin axis dragging and orbiting axis 
dragging can be grouped as Lense-Thirring frame-dragging effects. In 2004, Ciufolini 
and Pavlis146 reported a measurement of the Lense-Thirring effect on the two 
Earth satellites, LAGEOS and LAGEOS2; it is 0.99 ± 0.10 of the value predicted by 
GR. In the same year, Gravity Probe B (GP-B, a space mission to test GR using 
cryogenic gyroscopes in orbit)147 was launched in April148; its final results are 
a geodetic drift rate of −6601.8 ± 18.3 mas/yr and a frame-dragging drift rate of 
−37.2 ± 7.2 mas/yr, to be compared with the GR predictions of −6606.1 mas/yr and 
−39.2 mas/yr, respectively; i.e. GP-B148 provides independent measurements of the 
geodetic and frame-dragging effects at an accuracy of 0.28% and 19%, respectively. 
The GP-B experiment has also verified the weak equivalence principle for macroscopic 
rotating bodies to ultra-precision.149 Recently, Ciufolini et al.150 have used about 
3.5 years of laser-range observations of the LARES, LAGEOS, and LAGEOS2 satellites 
together with the Earth gravity field model GGM05S produced by the space 
geodesy mission GRACE to measure the Earth’s dragging of inertial frames to be 
0.994 ± 0.002 ± 0.05 of the GR value with 0.002 the 1-σ formal error and 0.05 their 
preliminary estimate of systematic error. 
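
The GP-B percentages quoted above follow directly from the drift rates given in the text. The short sketch below redoes that arithmetic; the function name `ratio_to_gr` is an illustrative choice, and the only inputs are the four measured and predicted rates stated in the paragraph.

```python
# Drift rates quoted in the text, in mas/yr.
GEODETIC_MEASURED, GEODETIC_SIGMA, GEODETIC_GR = -6601.8, 18.3, -6606.1
FRAME_MEASURED, FRAME_SIGMA, FRAME_GR = -37.2, 7.2, -39.2


def ratio_to_gr(measured, predicted):
    """Measured effect expressed as a fraction of the GR prediction."""
    return measured / predicted


k_geodetic = ratio_to_gr(GEODETIC_MEASURED, GEODETIC_GR)
k_frame = ratio_to_gr(FRAME_MEASURED, FRAME_GR)

# Fractional uncertainties, taken relative to the measured values.
err_geodetic = abs(GEODETIC_SIGMA / GEODETIC_MEASURED)
err_frame = abs(FRAME_SIGMA / FRAME_MEASURED)

print(f"geodetic:       {k_geodetic:.5f} of GR, {100 * err_geodetic:.2f}% uncertainty")
print(f"frame-dragging: {k_frame:.3f} of GR, {100 * err_frame:.0f}% uncertainty")
```

The ratios come out close to 0.99935 and 0.95, matching the geodetic-precession and Lense-Thirring entries of Table 2. The frame-dragging uncertainty rounds to 19% when taken relative to the measured rate and to about 18% when taken relative to the GR prediction.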


5. Outlook: Ongoing and Next-Generation Tests 


In the early days, astronomical observations of the solar system provided the basis 
for developing gravitation theories. Gravitation theories provide the scientific basis 
of space exploration of Earth and the entire solar system. The advent of the space age 
and solar system exploration required range measurements in the solar system 
that made possible the creation of high-accuracy planetary and lunar ephemerides. 
These ephemerides in turn provide dynamical positioning atlases for solar system 
exploration and the precision tests of relativistic gravitational theories. As 
we have seen in the last section, ephemeris fitting for gravitational parameters in 
relativistic gravitational theories is playing an increasingly important role in the 
experimental tests. Experimentally, the improvement depends on the technological 
advance of radio ranging/Doppler tracking and laser ranging/tracking of spacecraft 
and celestial bodies in the solar system. 

In Table 2, we have seen that LLR reaches similar accuracy in determining 
relativistic parameters as compared to interplanetary radio ranging, even though 
in LLR the relativistic effects are weaker. The main reason is that the resolution 
depends on wavelength. Optical wavelength is four orders of magnitude shorter 
than microwave wavelength. The most precise radio Doppler tracking experiment 
is the Cassini radio wave retardation measurement.129 The Cassini radio system 
is a sophisticated multilink system that simultaneously receives two 
uplink signals at frequencies of X and Ka bands and transmits three downlink 
signals with X-band coherent with the X-band uplink, Ka-band coherent with the 
X-band uplink, and Ka-band coherent with the Ka-band uplink. X-band is a standard 
deep space communication frequency band at about 8.4 GHz; Ka-band is another 
deep space communication frequency band at about 32 GHz. The wavelength of Ka-band 
microwave is about 1 cm. The reason to use a multilink system is to measure 
and subtract the plasma dispersion, which is proportional to the square of the wavelength. 
For laser optical ranging, a typical wavelength is about 1 μm. There is a four-order 
difference in wavelength. For laser ranging, the plasma effect is eight orders 
smaller; in interplanetary space the subtraction is not needed. If one link is on 
Earth, subtraction of extra optical path length by two-wavelength observation or 
other means is still needed. With a four-order improvement in ranging, monitoring 
the noninertial spacecraft motion is required. One way is to use drag-free technology. 
LISA Pathfinder, launched on 3 December 2015, has successfully tested and 
demonstrated the drag-free technology to satisfy not just the requirement of LISA 
Pathfinder, but also basically the drag-free requirement of the LISA gravitational-wave 
space mission concept.151 The drag-free technology is ripe for relativistic missions 
in the solar system. Hence, we envisage an improvement of 3-4 orders of magnitude in testing 
relativistic gravity and solar system dynamics, say, in the next 25 years or so. 
This improvement is for all relativistic parameters. In the following, we give an outlook 
of improvements on the Eddington parameter γ for various ongoing/proposed 
experiments. Table 3 lists the aimed accuracy of such experiments. Some motiva- 
tions for determining y precisely to 10~°-10~® are given in Refs. 152 and 153. 
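
The multilink argument in the preceding paragraph can be illustrated with a minimal sketch. It assumes the usual cold-plasma model in which the extra signal delay scales as 1/f² (equivalently, as the square of the wavelength), so that two frequencies suffice to solve for the plasma-free delay. The function name, the vacuum delay and the dispersion coefficient below are hypothetical illustration values, and the actual Cassini analysis worked with Doppler observables rather than this simplified delay model.

```python
import math


def plasma_free_delay(t1, f1, t2, f2):
    """Combine delays t1, t2 (s) measured at frequencies f1, f2 (Hz), assuming
    t(f) = t_vacuum + A / f**2, and return t_vacuum."""
    return (f1**2 * t1 - f2**2 * t2) / (f1**2 - f2**2)


F_X, F_KA = 8.4e9, 32.0e9   # Hz, X-band and Ka-band frequencies quoted in the text
T_VACUUM = 1000.0           # s, hypothetical plasma-free light time
A_PLASMA = 1.0e14           # s Hz^2, hypothetical dispersion coefficient

t_x = T_VACUUM + A_PLASMA / F_X**2
t_ka = T_VACUUM + A_PLASMA / F_KA**2
recovered = plasma_free_delay(t_x, F_X, t_ka, F_KA)
print(f"X-band excess delay : {t_x - T_VACUUM:.3e} s")
print(f"Ka-band excess delay: {t_ka - T_VACUUM:.3e} s")
print(f"recovered - true    : {recovered - T_VACUUM:.3e} s")

# Scaling from ~1 cm (Ka-band) to ~1 micron (laser): the plasma term drops as wavelength squared.
suppression = (1.0e-6 / 1.0e-2) ** 2
print(f"plasma term at 1 micron relative to 1 cm: {suppression:.1e}")
```

The last line reproduces the eight orders of magnitude mentioned in the text, which is why no plasma subtraction is needed for laser links in interplanetary space.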

First, as we have discussed in Sec. 4, Gaia Data Release 1 (Gaia DR1)'*? has 
already become public and contained astrometric results for more than 1 billion 
stars brighter than magnitude 20.7 based on observations collected by the Gaia 
satellite during the first 14 months of its operational phase. With an expected 4-year 
observation period, a simulation shows that Gaia could measure γ to 1 × 10⁻⁵ to 
2 × 10⁻⁷ accuracy.!** This is listed as the second row in Table 3. 

BepiColombo is a joint mission to Mercury154 between ESA and the Japan 
Aerospace Exploration Agency (JAXA), executed under ESA leadership. The mission 
comprises two spacecraft: the Mercury Planetary Orbiter (MPO) and the 
Mercury Magnetospheric Orbiter (MMO). It will set off in 2018 on a journey to the 
smallest and least explored terrestrial planet in our Solar System. When it arrives 
at Mercury in late 2024, it will endure temperatures in excess of 350°C and gather 
data during its 1-year nominal mission, with a possible 1-year extension. Milani 
et al.155 have simulated the radio science of this mission: “While determining its 
orbit around Mercury, it will be possible to indirectly observe the motion of its 
center-of-mass, with an accuracy of several orders of magnitude better than what is 
possible by radar ranging to the planet’s surface. This is an opportunity to conduct 
a relativity experiment which will be a modern version of the traditional tests of 
GR, based upon Mercury’s perihelion advance and the relativistic light propagation 
near the Sun.” They predict that the determination of γ can reach 2 × 10⁻⁶. 


Table 3. Aimed accuracy of PPN space parameter γ for various ongoing/proposed 
experiments. 

Ongoing/Proposed experiment      Aimed accuracy of γ       Type of experiment 

GAIA (Refs. 132-134)             1 × 10⁻⁵ to 2 × 10⁻⁷      Deflection 
Bepi-Colombo (Refs. 154, 155)    2 × 10⁻⁶                  Retardation 
ASTROD I (Ref. 115)              3 × 10⁻⁸                  Retardation 
ASTROD (Ref. 117)                1 × 10⁻⁹                  Retardation 
Super-ASTROD (Ref. 161)          1 × 10⁻⁸                  Retardation 
Odyssey (Ref. 162)               1 × 10⁻⁷                  Retardation 
SAGAS (Ref. 163)                 1 × 10⁻⁷                  Retardation 
OSS (Ref. 164)                   1 × 10⁻⁷                  Retardation 

ASTROD 1 is envisaged as the first in a series of ASTROD missions.”°156 159 
The ASTROD I mission concept is to use one spacecraft carrying a telescope, four lasers, 
two event timers and a clock with a Venus swing-by orbit. Two-way, two-wavelength 
laser pulse ranging will be used between the spacecraft in a solar orbit and deep 
space laser stations on Earth, to achieve the ASTROD I goals of testing GR with 
an improvement in sensitivity of over three orders of magnitude, improving our 
understanding of gravity and aiding the development of a new quantum gravity 
theory; to measure key solar system parameters with increased accuracy; and to 
measure the time rate of change of the gravitational constant with improvement. 
Using the achieved accuracy of 3 ps in laser pulse timing and the demonstrated 
LISA Pathfinder drag-free capability, a simulation showed that the accuracy of the 
determination of γ will reach 3 × 10⁻⁸. 

The general concept of ASTROD is to have a constellation of drag-free space- 
craft navigate through the solar system and range with one another using optical 
devices to map the solar system gravitational field, to measure related solar system 
parameters, to test relativistic gravity, to observe solar g-mode oscillations, and 
to detect gravitational waves. A baseline implementation of ASTROD, also called 
ASTROD, is to have two spacecraft in separate solar orbits (one in inner solar 
orbit, the other in outer solar orbit), each carrying a payload of a proof mass, two 
telescopes, two 1—2 W lasers, a clock and a drag-free system, together with a similar 
spacecraft near Earth around one of the Lagrange points L1/L2. The three space- 
craft range coherently with one another using lasers to map solar system gravity, to 
test relativistic gravity, to observe solar g-mode oscillations, and to detect gravitational 
waves. Since it will come after ASTROD I, we assume 1 ps timing accuracy and 
10 times better drag-free performance than what LISA Pathfinder achieved. With 
these requirements, the accuracy of the determination of γ will reach 1 × 10⁻⁹ in 
3.5 years.160 

Super-ASTROD,161 Odyssey,162 SAGAS (Search for Anomalous Gravitation 
using Atomic Sensors)163 and OSS (Outer Solar System)164 are four mission concepts 
to test fundamental physics and to explore the outer solar system. 

Solar System Odyssey162 is designed to perform a comprehensive set of gravitational 
tests in the Solar System. The mission has four major scientific objectives: (1) 
significantly improve the accuracy of deep space gravity tests; (2) investigate planetary 
flybys; (3) improve the current accuracy of the measurements of the Eddington 
parameter; (4) map the gravity field in outer regions of the Solar System. For 
improving the current accuracy of the measurement of the Eddington parameter 
γ, Odyssey proposes to use an improved multi-frequency radio link of the Cassini 
type together with a precision accelerometer and a possible laser tracking option 
and aims at measuring γ at an accuracy of 10⁻⁷. 

SAGAS163 aims at flying highly sensitive atomic sensors (optical clock, cold 
atom accelerometer, optical link) on a Solar System escape trajectory. It also aims 
at measuring γ at an accuracy of (1-2) × 10⁻⁷. 

OSS164 is an outer solar system exploration mission concept. The OSS probe 
would carry instruments allowing precise tracking of the spacecraft during the 
cruise. It would facilitate improved tests of the laws of gravity in deep space. A 
largely improved accuracy can be attained with the up-scaling option of a laser 
ranging equipment onboard to measure the parameter γ at the 10⁻⁷ level. 

Super-ASTROD161 is a mission concept with four spacecraft in 5 AU orbits 
together with an Earth-Sun L1/L2 spacecraft ranging optically with one another 
to probe primordial gravitational waves with frequencies 0.1 μHz-1 mHz, to test fundamental 
laws of spacetime and to map the outer-solar-system mass distribution 
and dynamics. With larger orbits, the main goal of Super-ASTROD in testing relativistic 
gravity is not to improve on Parametrized Post-Newtonian (PPN) parameters 
over ASTROD I / ASTROD; instead it is to test cosmological theories which give 
larger modifications from GR for larger orbits. However, with the same or better ranging 
capability than ASTROD I / ASTROD, the accuracy of its determination of γ 
will be better than 1 × 10⁻⁸. 

All four mission concepts explore gravity at deep space to bridge the gap between 
inner solar system tests and cosmological tests. They are most relevant to the 
detection/constraint of dark matter and dark energy, and to the tests of MOND 
models and the dark energy dynamical models. 

Since we had another review on “Equivalence principles, spacetime structure 
"17 early this year, we did not discuss space missions for 
testing (weak) equivalence principles. Here we just mention in passing that MICROSCOPE 
(MICRO-Satellite à traînée Compensée pour l’Observation 
du Principe d’Équivalence)165,166 has been in orbit since 26 April 2016 with the 
aim of improving the test accuracy by two orders of magnitude over any 
ground-based weak-equivalence-principle experiment, and is performing functional 
tests successfully. !6 

With increasing outreach and precision of observations, astrophysics and cos- 
mology became increasingly important for developing gravitation theories; notably 
the precise timing of pulses from pulsars** and various cosmological tests.!” 

During the last 157 years, the precisions of laboratory and space experiments, 
and the precisions of astrophysical and cosmological observations on the tests of 
relativistic gravity have improved by 3-4 orders of magnitude. The advent of the space 
age has stimulated the development of numerical ephemerides. Doppler and ranging 
observations from various space missions drive the ephemerides to ever increasing 
precision. For the last decade, we have seen great progress in various aspects of 
testing relativistic gravity in the solar system. Systematic modeling and ephemeris 
fitting of all the observational data becomes standard. The pending testing of rel- 
ativistic gravity better than 10~°-10~° precision requires the development of 2PN 
(post-post-Newtonian) numerical ephemerides. In the next 25 years, we envisage 
another improvement of 3-4 orders of magnitude in all directions of tests of relativistic 
gravity. This will enhance interest and development in both experimental and theoretical 
aspects of gravity, and may lead to answers to some profound questions of gravity 
and the cosmos. 

Gravitation is clearly empirical. As precision is increased by orders of magnitude, 
we are in a position to explore deeper into the origin of gravitation. The current 
and coming generations hold such promises. 

Pulsars are wonderful gravitational probes. Their tiny size and stellar mass give their 
rotation periods a stability comparable to that of atomic frequency standards. This 
is especially true of the rapidly rotating “millisecond pulsars” (MSPs). Many of these 
rapidly rotating pulsars are in orbit with another star, allowing pulsar timing to probe 
relativistic perturbations to the orbital motion. Pulsars have provided the most stringent 
tests of theories of relativistic gravitation, especially in the strong-field regime, and 
have shown that Einstein’s general theory of relativity is an accurate description of 
the observed motions. Many other gravitational theories are effectively ruled out or at 
least severely constrained by these results. MSPs can also be used to form a “Pulsar 
Timing Array” (PTA). PTAs are Galactic-scale interferometers that have the potential 
to directly detect nanohertz gravitational waves from astrophysical sources. Orbiting 
super-massive black holes in the cores of distant galaxies are the sources most likely 
to be detectable. Although no evidence for gravitational waves has yet been found in 
PTA data sets, the latest limits are seriously constraining current ideas on galaxy and 
black-hole evolution in the early universe. 


Keywords: Gravity; gravitational waves; pulsars; pulsar timing; general relativity. 


1. Introduction 


Pulsars are rotating neutron stars that emit beams of radiation which sweep across 
the sky as the star rotates. A beam sweeping across the Earth may be detected as a 
pulse that repeats with a periodicity equal to the rotation period of the star. Because 
of the large mass of neutron stars, ~1.4 M⊙, and their tiny size, radii ~15 km (see 
Ref. 76 for a review of neutron-star properties) the rotation period of neutron stars 
is incredibly stable, with a stability comparable to that of the best atomic clocks 
on Earth. This great period stability, combined with the fact that pulsars are often 
in a binary orbit with another star, makes them wonderful probes of relativistic 
gravity. Tiny perturbations to their period resulting from, for example, relativistic 
effects in a binary orbit, may be detected and compared with the predictions of a 
gravitational theory. Pulsars may also be used as detectors for gravitational waves 
passing through the Galaxy. To separate the effects of gravitational waves from 
other perturbations, signals from pulsars in different directions on the sky must be 
compared — exactly analogous to the way laser-interferometer gravitational-wave 
detectors compare laser phases in perpendicular arms. 

More than 2400 pulsars are now known.ᵃ The vast majority of them reside 
in our Galaxy, typically at distances of a few thousand light-years from the Sun. 
Their pulse periods, P, range between 1.4 milliseconds and 12 s and fall into two 
main groups. The so-called “normal” pulsars have periods longer than about 30 
milliseconds and the “millisecond” pulsars (MSPs) have shorter periods. MSPs 
comprise about 15% of the known pulsar population. 

Although pulsar periods are very stable, they are not constant. In their own 
reference frame, all pulsars are slowing down, albeit very slowly. Pulsars are powered 
by their rotational energy. They have extremely strong magnetic fields and, as they 
spin, they emit streams of relativistic particles and so-called “dipole radiation”, 
electromagnetic waves with period equal to the rotation period of the star. These 
carry energy away from the star resulting in a steady increase in the pulse period. 
Typical rates of period increase, Ṗ, are a part in 10¹⁵ for normal pulsars and much 
less for MSPs. Assuming that the surrounding magnetic field is predominantly 
dipole, the characteristic age τ_c of a pulsar is given by 

$$\tau_c = \frac{P}{2\dot{P}} \qquad (1)$$

and its surface dipole magnetic field strength (in gaussᵇ) is 

$$B_s \approx 3.2 \times 10^{19}\,(P\dot{P})^{1/2}\ \mathrm{G}. \qquad (2)$$


Figure 1 shows the distribution of pulsars on the P-Ṗ plane, with several different 
types of pulsars indicated. For most normal pulsars, 7, is between 10° and 10’ years 
and B, is between 10! and 1013 G. For MSPs, the corresponding ranges are 10° to 
101! yrs and 10° to 101° G. 
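
Equations (1) and (2) are simple enough to evaluate directly. The sketch below does so for two illustrative (P, Ṗ) pairs: one roughly Crab-like and one typical of a millisecond pulsar. The specific input values and the function names are assumptions chosen for illustration and are not taken from the text; P is in seconds and Ṗ is dimensionless (s/s).

```python
SECONDS_PER_YEAR = 365.25 * 86400.0


def characteristic_age_yr(period_s, pdot):
    """Characteristic age tau_c = P / (2 Pdot), Eq. (1), converted to years."""
    return period_s / (2.0 * pdot) / SECONDS_PER_YEAR


def surface_dipole_field_gauss(period_s, pdot):
    """Surface dipole field B_s ~ 3.2e19 (P Pdot)^(1/2) gauss, Eq. (2)."""
    return 3.2e19 * (period_s * pdot) ** 0.5


# Illustrative inputs (assumed, approximate): a young Crab-like pulsar and an MSP.
examples = {
    "young, Crab-like": (0.033, 4.2e-13),
    "millisecond pulsar": (0.005, 1.0e-20),
}

for label, (p, pdot) in examples.items():
    age = characteristic_age_yr(p, pdot)
    field = surface_dipole_field_gauss(p, pdot)
    print(f"{label:20s} tau_c = {age:.2e} yr, B_s = {field:.2e} G")
```

The young pulsar comes out with an age of order 10³ yr and a field of a few 10¹² G, while the MSP example gives an age of several 10⁹ yr and a field of a few 10⁸ G, which illustrates how far apart the two populations sit on the P-Ṗ plane.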

Normal pulsars and MSPs have quite different evolutionary histories. Most if 
not all normal pulsars are formed in supernova explosions at the death of a massive 
star. They age with relatively constant B, until the pulse emission mechanism 
begins to fail when 7, reaches about 10° yrs. Many young pulsars (7 S< 10*yrs) 
are located within supernova remnants, with the most famous example being the 
Crab pulsar, PSR B0531+21, located near the center of the Crab Nebula. Most 
young pulsars lie relatively close to the Galactic Plane, consistent with the idea 
that they are formed from massive stars. MSPs on the other hand are much more 
widely distributed in the Galactic halo. They are believed to originate from old, 
slowly rotating and probably dead neutron stars that accrete matter and angular 
momentum from an evolving binary companion. This “recycling” process increases 
their spin rate so that they have periods in the millisecond range and re-energizes 
the beamed emission (see, e.g. Ref. 17). 

Figure 1 shows that the majority of MSPs are members of a binary system, 
consistent with this formation scenario. The accretion also suppresses their apparent 
magnetic field, so that MSPs have Ṗ values about five orders of magnitude less than 
normal pulsars. Since the level of intrinsic period irregularities is related to Ṗ,17 
it is this low B_s that makes MSPs extremely stable clocks and suitable tools for 
the study of relativistic gravitation. Figure 1 also indicates the class of radio-quiet 
pulsars. Essentially all of the 72 known radio-quiet pulsars are young and solitary. 
They include the so-called “magnetars” which lie in the upper right side of the 
P-Ṗ diagram where dipole magnetic fields are strongest. 


ᵃ See the ATNF Pulsar Catalogue: http://www.atnf.csiro.au/research/pulsar/psrcat. 
ᵇ 1 gauss (G) = 10⁻⁴ tesla. 

Fig. 1. Distribution of pulsars on the P-Ṗ plane with radio-loud and radio-quiet pulsars indicated. 
Binary pulsars are indicated by a circle around the point and, for those with a neutron-star 
companion, the circle is filled in. Lines of constant characteristic age (τ_c) and surface-dipole 
magnetic field (B_s) are shown. 

Most double-neutron-star systems lie in the zone between the MSPs and the 
normal pulsars. These systems are believed to have been partially recycled prior to 
the formation of the second-born neutron star. In almost every case, the observed 
pulsar is the recycled one since it has a much longer active lifetime than the 
newly formed young pulsar. A famous exception to this is the Double Pulsar (PSR 
J0737—3039A/B) in which the second-born star (B) is (or was) still visible.°? In 
Fig. 1, the B pulsar is the solitary double-neutron-star system on the right-hand 
side of the plot. The double-neutron-star system identified near the middle of the 
plot contains the relatively young pulsar, PSR J1906+0746. There is some doubt 
about the nature of the companion in this system — it could be a heavy white 
dwarf.'°° These double-neutron-star systems and their use as probes of relativistic 
gravity are discussed in some detail in Sec. 2.1. 


1.1. Pulsar timing 


The most important contributions of pulsars to investigations of gravitational the- 
ories and gravitational waves rely on precision pulsar timing observations. These 
allow both the relativistic perturbations to binary orbits to be studied in detail and 
the potential detection of the tiny period fluctuations generated by gravitational 
waves passing through our Galaxy. Because of the importance of pulsar timing to 
these studies, we give here a brief review of its basic principles. 

The basic observable in pulsar timing is the time of arrival (ToA) of a pulse 
at an observatory. In fact, because of signal/noise limitations and the intrinsic 
fluctuation of individual pulse shapes, ToA measurements are based on mean pulse 
profiles formed by synchronously averaging the data, typically for times between 
several minutes and an hour. The time at which a fiducial pulse phase (usually 
near the pulse peak) arrives at the telescope is determined by cross-correlating the 
observed mean pulse profile with a standard pulse template. A series of these ToAs 
is measured over many days, months, years and even decades for the pulsar of 
interest. 
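
The template-matching step described above can be sketched in a few lines. The example below builds a noise-free Gaussian pulse profile, shifts it by a known number of phase bins, and recovers the shift by a brute-force circular cross-correlation with the template. The Gaussian pulse shape, the bin count, the 5 ms period and all function names are assumptions made for illustration; practical timing software typically works in the Fourier domain to reach sub-bin precision.

```python
import math


def gaussian_profile(n_bins, centre_bin, width_bins):
    """Periodic Gaussian pulse profile sampled in n_bins phase bins."""
    profile = []
    for i in range(n_bins):
        # Wrap the distance from the pulse centre into [-n_bins/2, n_bins/2).
        d = (i - centre_bin + n_bins / 2) % n_bins - n_bins / 2
        profile.append(math.exp(-0.5 * (d / width_bins) ** 2))
    return profile


def best_shift_bins(profile, template):
    """Integer lag (in bins) maximising the circular cross-correlation."""
    n = len(template)
    best_lag, best_cc = 0, float("-inf")
    for lag in range(n):
        cc = sum(profile[(i + lag) % n] * template[i] for i in range(n))
        if cc > best_cc:
            best_lag, best_cc = lag, cc
    return best_lag


N_BINS = 256
PERIOD_S = 0.005                                       # hypothetical 5 ms pulsar
template = gaussian_profile(N_BINS, 64, 4.0)           # standard template
observed = gaussian_profile(N_BINS, 64 + 10, 4.0)      # mean profile arriving 10 bins late

lag = best_shift_bins(observed, template)
toa_offset_s = lag * PERIOD_S / N_BINS
print(f"recovered shift: {lag} bins = {toa_offset_s * 1e6:.1f} microseconds")
```

Adding this offset to the start time of the averaged observation gives the ToA of the fiducial phase; with real data the profile is noisy and the uncertainty of the fitted shift becomes the ToA uncertainty.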

These observatory ToAs are affected by the rotational and orbital motion of 
the Earth (and for satellite observatories, the orbital motion of the satellite). To 
remove these effects, the observed ToAs are referred to the solar-system barycen- 
ter (center of mass) which is assumed to be inertial (unaccelerated) with respect 
to the distant universe.ᶜ This correction makes use of a solar-system ephemeris 
giving predictions of the position of the center of the Earth with respect to the 
solar-system barycenter. Such ephemerides, for example, the Jet Propulsion Labo- 
ratory ephemeris DE 421,*° are generated by fitting relativistic models to planetary 
and spacecraft data. The correction also takes into account the relativistic varia- 
tions in terrestrial time resulting from the Earth’s motion. 

The resulting barycentric ToAs are then compared with predicted pulse ToAs 
based on a model for the pulsar. The pulsar model can have 20 or more parameters; 
generally included are the pulse frequency (ν = 1/P), frequency time derivative 
(ν̇), the pulsar position and the five Keplerian binary parameters if the pulsar is a 
member of a binary system. If the data are recorded at different radio frequencies, 
it is also necessary to include the dispersion measure (DM) which quantifies the 
frequency-dependent delay suffered by the pulses as they propagate through the 
interstellar medium. 


ᶜ This neglects any acceleration of the solar-system barycenter resulting from, for example, Galactic 
rotation. For some precision timing experiments, such effects are taken into account at a later stage 
of the analysis. 
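
To make the role of the dispersion measure concrete, the sketch below uses the standard cold-plasma dispersion law, in which the delay relative to infinite frequency is K·DM/f² with K ≈ 4.148808 × 10³ s MHz² pc⁻¹ cm³. This relation and constant are not stated in the text and are assumed here; the function names and the example DM of 30 pc cm⁻³ are illustrative.

```python
K_DM = 4.148808e3  # s MHz^2 pc^-1 cm^3, dispersion constant (assumed standard value)


def dispersion_delay_s(dm, freq_mhz):
    """Delay (s) at observing frequency freq_mhz relative to infinite frequency."""
    return K_DM * dm / freq_mhz**2


def dm_from_two_frequencies(t_lo, f_lo_mhz, t_hi, f_hi_mhz):
    """Solve for DM from arrival times (s) of the same pulse at two frequencies."""
    return (t_lo - t_hi) / (K_DM * (f_lo_mhz**-2 - f_hi_mhz**-2))


DM_TRUE = 30.0                       # pc cm^-3, illustrative value
F_LO, F_HI = 700.0, 1400.0           # MHz, illustrative observing bands

t_lo = dispersion_delay_s(DM_TRUE, F_LO)
t_hi = dispersion_delay_s(DM_TRUE, F_HI)
dm_est = dm_from_two_frequencies(t_lo, F_LO, t_hi, F_HI)

print(f"delay at {F_LO:.0f} MHz : {t_lo * 1e3:.1f} ms")
print(f"delay at {F_HI:.0f} MHz: {t_hi * 1e3:.1f} ms")
print(f"DM recovered from the two arrival times: {dm_est:.3f} pc cm^-3")
```

Because the delays are tens to hundreds of milliseconds while ToA uncertainties can be below a microsecond, even small errors or time variations in DM matter, which is why DM is fitted as part of the timing model when multi-frequency data are available.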

The differences between the observed and predicted ToAs are known as “timing 
residuals”. Errors in any of the model parameters result in systematic variations 
in the timing residuals as a function of time. For example, if the model pulse fre- 
quency is too small, the residuals will grow linearly as illustrated in Fig. 2. The 
required correction to the pulse frequency is given by the slope of this variation. 
An error in the pulsar position results in an annual sine curve which arises from 
the barycentric correction. The phase and amplitude of this curve give the correc- 
tions to the two position coordinates. Similarly, a pulsar proper motion results in 
a linearly growing sine curve (away from the reference epoch). For a sufficiently 
close pulsar, the curvature of the wavefront results in a biannual term in the residuals, 
and offsets in binary parameters result in terms which vary with the orbital 
periodicity. 

Fig. 2. Variations in timing residuals for offsets in several parameters. The model pulsar has a 
pulse frequency of 300 Hz (3.3 ms period) and is in an eccentric (e = 0.4) binary orbit of period 
190 days. The reference epochs for period, position and binary phase are at the middle of the 
plotted range. 

Since offsets in different parameters in general result in different systematic 
residual variations, it is possible to do a least-squares fit to the observed residuals 
for the correction to any desired parameter.ᵈ The precision of the fitted parameters 
is often amazingly high. For example, the relative precision of the pulse frequency 
determination is ~δt/T, where δt is the typical uncertainty in the ToAs and T is 
the data span. Since, for MSPs at least, δt is often < 1 μs and T is often many years, 
the relative precision of the measured v can easily exceed 1:10'*. Similarly, pulsar 
positions can be measured to micro-arcseconds and binary eccentricities measured 
to 1: 10°. These very high precisions often allow higher-order terms, for example, 
resulting from relativistic perturbations to the binary orbit, to be measured. 
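
The simplest case of such a fit, a pure pulse-frequency offset, can be sketched as an ordinary straight-line least-squares fit to the residuals. In the example below the residual is assumed to grow as (Δν/ν)·t, with a sign convention in which a model frequency that is too small gives a positive slope; real timing packages may use the opposite sign, and they fit all parameters simultaneously. The 300 Hz frequency follows the model pulsar of Fig. 2, while the offset, data span and function names are hypothetical.

```python
def fit_line(times, values):
    """Closed-form least-squares fit values ~ intercept + slope * times."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    slope = sxy / sxx
    return v_mean - slope * t_mean, slope


NU = 300.0                              # Hz, model pulse frequency (as in Fig. 2)
DELTA_NU_TRUE = 1.0e-12                 # Hz, hypothetical error in the model frequency
SPAN_S = 5 * 365.25 * 86400.0           # 5-year data span, epoch at mid-span

times = [SPAN_S * (k / 99 - 0.5) for k in range(100)]
residuals = [DELTA_NU_TRUE / NU * t for t in times]   # noise-free residuals, in seconds

intercept, slope = fit_line(times, residuals)
delta_nu_fit = slope * NU
print(f"fitted frequency correction: {delta_nu_fit:.3e} Hz")

# Rule of thumb from the text: relative precision ~ delta_t / T.
relative_precision = 1.0e-6 / (10 * 365.25 * 86400.0)  # 1 microsecond ToAs over 10 years
print(f"delta_t / T for 1 microsecond and 10 yr: {relative_precision:.1e}")
```

The slope of the residuals gives the fractional frequency correction, and the δt/T estimate of a few parts in 10¹⁵ shows why frequencies of MSPs are known to so many significant figures.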


2. Tests of Relativistic Gravity 
2.1. Tests of general relativity with double-neutron-star systems 
2.1.1. The Hulse-Taylor binary, PSR B1913+16 


The discovery at Arecibo Observatory in 1974 of the first-known binary pulsar, 
PSR B1913+16, by Hulse and Taylor61 was remarkable in a number of respects. 
First, it showed that pulsars with short pulse periods but large characteristic ages 
(59 ms and 10⁸ yr, respectively, for PSR B1913+16) could exist. The period of 
PSR B1913+16 was second only to the Crab pulsar, but its age was enormously 
greater than that of other short-period pulsars known at the time. Second, it was 
in a binary orbit with a relatively massive star, very likely another neutron star, 
showing that an evolutionary pathway to such systems existed. Third, its orbital 
period was extraordinarily short, only 7.75 h, its eccentricity large, ~0.617, and, 
as shown in Fig. 3, its maximum orbital velocity very high, ~300 km s⁻¹ or 0.1% 
of the velocity of light. As was immediately recognized by Hulse and Taylor, these 
properties opened up the possibility that relativistic perturbations to the orbit were 
potentially measurable. Lowest-order relativistic effects go as (v/c)², and so the 
variations are of order 1:10⁶, enormous by the standards of pulsar measurements. 

Relativistic effects in binary motion can be expressed in terms of “post- 
Keplerian” parameters that describe departures from Keplerian motion (see, e.g. 
Ref. 129). The first such parameter to be observed was periastron precession.!®° 
In Einstein’s general theory of relativity (GR) the rate of periastron precession 
(averaged over the binary orbit) is given by: 


$$\dot{\omega} = 3\left(\frac{P_b}{2\pi}\right)^{-5/3} (T_\odot M)^{2/3} (1 - e^2)^{-1}, \qquad (3)$$


d The data span must be sufficiently long to avoid excessive covariance between the variations for
different fitted parameters. 
Fig. 3. Velocity curve for the Hulse-Taylor binary pulsar, PSR B1913+16 (Ref. 61).


where $\omega$ is the longitude of periastron measured from the ascending node, $P_b$ is the
orbital period, $T_\odot = GM_\odot/c^3 = 4.925490947\,\mu$s, $G$ is the gravitational constant,
$M_\odot$ is the mass of the Sun, $e$ is the orbital eccentricity and $M = m_1 + m_2$ is the
sum of the pulsar mass $m_1$ and the companion mass $m_2$ in solar units. It is worth
noting that this is the same effect as the excess perihelion advance of Mercury that 
was used by Einstein*? as an observational verification of GR. The relativistic effect 
for Mercury is just 43 arcsec per century, minuscule compared to the 4.22° per year
predicted and observed for PSR B1913+16.
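
The comparison between Mercury and PSR B1913+16 can be checked numerically. The sketch below evaluates the GR periastron-advance expression, Eq. (3), in Python. The function name is illustrative; the total mass of 2.8 $M_\odot$ for PSR B1913+16 (two stars of about 1.4 $M_\odot$) is a rounded value, and the orbital elements used for Mercury are assumed standard values that do not appear in this text.

```python
import math

T_SUN = 4.925490947e-6              # GM_sun/c^3 in seconds (value quoted in the text)
SECONDS_PER_YEAR = 365.25 * 86400.0

def periastron_advance_deg_per_yr(pb_s, ecc, total_mass_msun):
    """Eq. (3): orbit-averaged GR periastron advance in degrees per year."""
    rate_rad_s = (3.0 * (pb_s / (2.0 * math.pi)) ** (-5.0 / 3.0)
                  * (T_SUN * total_mass_msun) ** (2.0 / 3.0)
                  / (1.0 - ecc ** 2))
    return math.degrees(rate_rad_s) * SECONDS_PER_YEAR

# PSR B1913+16: P_b = 7.75 h, e = 0.617, M ~ 2.8 Msun (rounded inputs).
b1913 = periastron_advance_deg_per_yr(7.75 * 3600.0, 0.617, 2.8)

# Mercury (assumed elements): P_b = 87.969 d, e = 0.2056, M = 1 Msun.
mercury = periastron_advance_deg_per_yr(87.969 * 86400.0, 0.2056, 1.0)
mercury_arcsec_per_century = mercury * 3600.0 * 100.0

print(f"B1913+16: {b1913:.2f} deg/yr")
print(f"Mercury:  {mercury_arcsec_per_century:.1f} arcsec/century")
```

With these rounded inputs the binary pulsar value comes out close to 4.2° per year and the Mercury value close to 43 arcsec per century.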

The next most significant parameter, normally labeled $\gamma$, describes the combination
of gravitational redshift and second-order (or transverse) Doppler shift, both
of which have the same dependence on orbital phase. In GR,


$$\gamma = e\left(\frac{P_b}{2\pi}\right)^{1/3} T_\odot^{2/3} M^{-4/3} m_2 (m_1 + 2m_2). \qquad (4)$$


Since the Keplerian parameters are very well determined, measurement of $\dot{\omega}$ and $\gamma$
gives two equations in two unknowns, $m_1$ and $m_2$, and so the two stellar masses can
be determined. For PSR B1913+16, these are both close to 1.4 $M_\odot$, confirming the
double-neutron-star nature of the system. An important consequence of this was 
that the two stars could safely be treated as point masses in GR, thereby allowing 
precise tests of the theory. 
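
The two-equations-in-two-unknowns step can be made concrete. The Python sketch below inverts Eq. (3) for the total mass and then solves Eq. (4), which becomes a quadratic in $m_2$ once $m_1 + 2m_2$ is written as $M + m_2$. The function name is illustrative; the periastron advance and $\gamma$ are the PSR B1913+16 entries of Table 1, while the orbital period and eccentricity are the rounded values quoted in the text, so the result is only approximate.

```python
import math

T_SUN = 4.925490947e-6              # GM_sun/c^3 in seconds (value quoted in the text)
SECONDS_PER_YEAR = 365.25 * 86400.0

def masses_from_omdot_gamma(pb_s, ecc, omdot_deg_yr, gamma_s):
    """Invert Eqs. (3) and (4) for (m1, m2) in solar masses."""
    n_inv = pb_s / (2.0 * math.pi)                          # P_b / (2 pi), seconds
    omdot = math.radians(omdot_deg_yr) / SECONDS_PER_YEAR   # radians per second
    # Eq. (3) solved for the total mass M = m1 + m2.
    total = (omdot * (1.0 - ecc ** 2) / 3.0 * n_inv ** (5.0 / 3.0)) ** 1.5 / T_SUN
    # Eq. (4) with m1 + 2*m2 = M + m2 gives m2**2 + M*m2 - k = 0.
    k = (gamma_s * total ** (4.0 / 3.0)
         / (ecc * n_inv ** (1.0 / 3.0) * T_SUN ** (2.0 / 3.0)))
    m2 = 0.5 * (-total + math.sqrt(total ** 2 + 4.0 * k))
    return total - m2, m2

# Rounded P_b and e from the text; omega-dot and gamma from Table 1.
m1, m2 = masses_from_omdot_gamma(7.75 * 3600.0, 0.617, 4.226598, 4.2992e-3)
print(f"m1 = {m1:.2f} Msun, m2 = {m2:.2f} Msun")
```

Both masses come out close to 1.4 $M_\odot$, as stated above. A real analysis would use the full-precision Keplerian parameters and propagate their uncertainties.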

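Given the Keplerian parameters and the two masses, the next post-Keplerian
parameter, orbital decay due to the emission of gravitational waves from the system,
given in GR by

$$\dot{P}_b = -\frac{192\pi}{5}\left(\frac{P_b}{2\pi}\right)^{-5/3}\left(1 + \frac{73}{24}e^2 + \frac{37}{96}e^4\right)(1 - e^2)^{-7/2}\, T_\odot^{5/3}\, m_1 m_2 M^{-1/3}, \qquad (5)$$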
Fig. 4. Comparison of the observed and predicted orbital period decay in PSR B1913+16. The
orbital decay is quantified by the shift in the time of periastron passage with respect to a nonde- 
caying orbit. The parabolic curve is the predicted decay from GR (Ref. 143). 


could be predicted and compared with observation. This therefore constituted a 
unique test of GR in strong-field gravity. After just four years, the $\dot{P}_b$ term was well
measured and shown to be consistent with the GR prediction.!°* The orbital decay, 
including more recent results, is illustrated in Fig. 4; the ratio of observed to predicted
orbit decay is 0.997 ± 0.002 (Ref. 143). It should be mentioned that this near-perfect
agreement relies on correcting the observed $\dot{P}_b$ for the differential acceleration of
the solar system and the pulsar system in the gravitational field of the Galaxy. This
correction is $(-0.027 \pm 0.005) \times 10^{-12}$, or about 1% of the observed value, and the
uncertainty in the final result is dominated by the uncertainty in this correction.
Unfortunately, it is unlikely that this uncertainty will be significantly reduced in 
the near future since it mainly depends on the poorly known distance to the binary 
system. 
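
The size of the predicted orbital decay can be estimated with a short Python sketch. It assumes the standard quadrupole-formula (Peters) expression for $\dot{P}_b$ in GR, written out in the code comment. The function name is illustrative, and the masses of 1.44 and 1.39 $M_\odot$ are assumed values consistent with "close to 1.4 $M_\odot$"; the orbital period and eccentricity are the rounded values from the text.

```python
import math

T_SUN = 4.925490947e-6   # GM_sun/c^3 in seconds (value quoted in the text)

def orbital_decay_gr(pb_s, ecc, m1, m2):
    """Quadrupole-formula orbital period derivative (dimensionless, s/s).

    Pb_dot = -(192 pi / 5) (Pb / 2 pi)^(-5/3) f(e) T_sun^(5/3) m1 m2 M^(-1/3),
    with f(e) = (1 + 73/24 e^2 + 37/96 e^4) (1 - e^2)^(-7/2).
    """
    total = m1 + m2
    ecc_factor = ((1.0 + 73.0 / 24.0 * ecc ** 2 + 37.0 / 96.0 * ecc ** 4)
                  * (1.0 - ecc ** 2) ** (-3.5))
    return (-192.0 * math.pi / 5.0
            * (pb_s / (2.0 * math.pi)) ** (-5.0 / 3.0)
            * ecc_factor
            * T_SUN ** (5.0 / 3.0)
            * m1 * m2 * total ** (-1.0 / 3.0))

# Rounded P_b and e from the text; the two masses are assumed values.
pbdot = orbital_decay_gr(7.75 * 3600.0, 0.617, 1.44, 1.39)
print(f"predicted Pb_dot ~ {pbdot:.2e}")
```

The result is about $-2.4 \times 10^{-12}$, the same order as the observed value for PSR B1913+16 listed in Table 1. A comparison at the 0.2% level quoted above requires the full-precision orbital parameters and the Galactic correction.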

Two more post-Keplerian parameters, denoted by r and s, relate to the Shapiro 
delay suffered by the pulsar signal while passing through the curved spacetime 
surrounding the companion star.!?° The relations for them in GR are as follows: 


$$r = T_\odot m_2, \qquad (6)$$

$$s = \sin i = \left(\frac{a_1 \sin i}{c}\right)\left(\frac{P_b}{2\pi}\right)^{-2/3} T_\odot^{-1/3} M^{2/3} m_2^{-1}, \qquad (7)$$


where $(a_1/c)\sin i$ is the projected semi-major axis of the pulsar orbit (in time
units), a Keplerian parameter. The Shapiro delay is generally only observable when 
the orbital inclination is relatively close to 90°, that is, the orbit is seen close to
edge-on. For the Hulse-Taylor binary pulsar the orbital inclination is about 47° and
the Shapiro delay is small and covariant with Keplerian parameters. However, it has
been observed in a number of other neutron-star binary systems, as will be discussed
below.

Another relativistic effect, geodetic precession, has observable consequences for 
PSR B1913+16 and several other binary pulsars. A neutron star formed in a super- 
nova explosion receives a “kick” during or shortly after the explosion which typically 
gives the star a velocity of several hundred km s$^{-1}$.®® If the pulsar is a member of a
binary system which is not disrupted by the kick, the orbital axis is changed so that
the pulsar spin axis, which was most likely aligned with the orbital axis prior to the
explosion, is no longer aligned. Since kick velocities are often comparable to or even 
larger than the orbital velocities, the misalignment angle can be large. Precession 
of the spin axis will therefore alter the aspect of the radio beam seen by an observer 
and may even move the beam out of the line of sight. 

In GR, the precessional angular frequency is given by 


$$\Omega_p = \left(\frac{2\pi}{P_b}\right)^{5/3} T_\odot^{2/3}\, \frac{m_2 (4m_1 + 3m_2)}{2(1 - e^2) M^{4/3}}. \qquad (8)$$


For PSR B1913+16, the corresponding precessional period is about 297 years, so the
aspect changes over observational data spans are small. Nevertheless, changes in
the relative amplitude of the two peaks in the PSR B1913+16 profile were reported
by Weisberg et al. (Ref. 144) and attributed to the effects of geodetic precession with a
“patchy” beam. For a basically conal beam geometry, the separation of the two
profile components would be expected to change, and evidence for this was first
found by Kramer,”? leading to an estimate of the misalignment angle of about 22°.
A data set extending to 2001 was analyzed by Weisberg and Taylor (Ref. 145), obtaining
results similar to those of Kramer.’? Their best-fitting model gives the “peanut”-shaped
beam shown in Fig. 5. Clifton and Weisberg?® have shown that a set of
circular nested emission cones can also give apparent pulse-width variations similar
to those observed. Over the 20-year interval covered by the data set, the “impact
parameter” (minimum angle between the beam symmetry axis and the observer’s
line of sight) has changed by a rather small amount, about 3°, from −3.5° to −6.5°.
Extrapolation of this model suggests that the pulsar will become unobservable in
about 2025. While this result is compatible with relativistic precession, it is not
possible to derive an independent measure of the precessional rate from these data.
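
The precessional period of about 297 years quoted for PSR B1913+16 follows directly from the GR expression for the geodetic precession rate, Eq. (8), as the Python sketch below illustrates. The function name is illustrative, the masses (1.44 and 1.39 $M_\odot$) are assumed values near 1.4 $M_\odot$, and the orbital period and eccentricity are the rounded values from the text.

```python
import math

T_SUN = 4.925490947e-6              # GM_sun/c^3 in seconds (value quoted in the text)
SECONDS_PER_YEAR = 365.25 * 86400.0

def geodetic_precession_rate(pb_s, ecc, m1, m2):
    """Eq. (8): spin precession rate of the pulsar (mass m1), in rad/s."""
    total = m1 + m2
    return ((2.0 * math.pi / pb_s) ** (5.0 / 3.0)
            * T_SUN ** (2.0 / 3.0)
            * m2 * (4.0 * m1 + 3.0 * m2)
            / (2.0 * (1.0 - ecc ** 2) * total ** (4.0 / 3.0)))

# Rounded P_b and e from the text; the two masses are assumed values.
rate = geodetic_precession_rate(7.75 * 3600.0, 0.617, 1.44, 1.39)
period_yr = 2.0 * math.pi / rate / SECONDS_PER_YEAR
rate_deg_yr = math.degrees(rate) * SECONDS_PER_YEAR

print(f"precession period ~ {period_yr:.0f} yr ({rate_deg_yr:.2f} deg/yr)")
```

The period comes out just under 300 years, that is, a rate of roughly 1.2° per year, which is why the aspect changes over a few decades are small.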


2.1.2. PSR B1534+12


PSR B1534+12, discovered by Wolszczan in 1990,!°? is a binary system with
parameters quite similar to those of B1913+16, notably a short orbital period 
(~ 10.1 hours), relatively high eccentricity (~ 0.27) and a compact orbit about 60% 
larger than that of PSR B1913+16. Analysis of less than a year’s data already 
Fig. 5. Contours of the symmetric part of the radio emission beam for PSR B1913+16, obtained
by fitting to the variation in pulse width over the interval 1981 to 2001 (Ref. 145). 


allowed measurement of two post-Keplerian parameters, $\dot{\omega}$ and $\gamma$, thereby determining
the masses of the two stars and confirming that they are both neutron
stars. These results also showed that the orbit was more edge-on than that of
PSR B1913+16, with an implied inclination angle of about 77°. Analysis of timing
data extending over 22 years by Fonseca et al. (Ref. 49) built on earlier results by Stairs
et al.,!! with significant detections of $r$ and $s$, the Shapiro-delay parameters, and
the orbital decay, $\dot{P}_b$, giving five post-Keplerian parameters and three independent
tests of GR. A fourth test of GR, albeit less precise, was provided by an analysis of
the evolution of the profile shape and polarization, yielding a rate for the geodetic
precession of the pulsar spin axis of $\Omega_p = 0.59(10)^\circ$ yr$^{-1}$ (Table 1), which is consistent with the
value predicted by GR. Figure 6 shows the so-called “mass-mass” diagram for PSR
B1534+12, illustrating these constraints.

If GR gives a correct description of the post-Keplerian parameters, all of these
constraints should be consistent with an allowed range of $m_1$ and $m_2$. For PSR
B1534+12, the masses are most accurately constrained by $\dot{\omega}$ and $\gamma$, but the predicted
constraint on $\dot{P}_b$ appears to be inconsistent. As for PSR B1913+16, the
measured value of $\dot{P}_b$ must be corrected for kinematic effects resulting from the differential
acceleration of the binary and solar system in the Galaxy. PSR B1534+12 is
much closer to the Sun than PSR B1913+16 and the so-called “Shklovskii” effect,!?°
Fig. 6. Plot of companion mass ($m_2$) versus pulsar mass ($m_1$) for PSR B1534+12. Constraints
derived from the measured post-Keplerian parameters assuming GR are shown as pairs of lines with
separation indicating the uncertainty range. For $\dot{\omega}$, $\gamma$ and $s$, the uncertainty ranges are too small
to see on the plot. In addition, a constraint from the measured spin precession rate is shown
(Ref. 49).


an apparent radial acceleration resulting from transverse motion of the binary system
($\dot{P}/P = \mu^2 d/c$, where $\mu$ is the pulsar proper motion and $d$ is the pulsar distance),
is also important. The distance estimate is based on the pulsar DM and a
model for the free electron density in the Galaxy and is quite uncertain. Stairs and
her colleagues (Refs. 49 and 131) inverted the problem, assuming that the GR prediction for $\dot{P}_b$
is correct, thereby deriving an improved value for the pulsar distance, a technique
first suggested by Bell and Bailes.!?
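
To give a feel for the size of the Shklovskii term, the Python sketch below evaluates the proper motion squared times distance, divided by the speed of light, and converts it to an apparent orbital period derivative. The function name is illustrative, and the proper motion (25 mas yr$^{-1}$) and distance (1 kpc) are hypothetical round numbers chosen only for the example; they are not measurements quoted in this text. The orbital period of 10.1 h is the value given above for PSR B1534+12.

```python
import math

C_M_S = 299792458.0                 # speed of light, m/s
KPC_M = 3.0857e19                   # one kiloparsec in metres
SECONDS_PER_YEAR = 365.25 * 86400.0

def shklovskii_fractional_rate(mu_mas_per_yr, distance_kpc):
    """Apparent P_dot / P from transverse motion: mu^2 d / c, in 1/s."""
    mu_rad_per_s = math.radians(mu_mas_per_yr / 3.6e6) / SECONDS_PER_YEAR
    return mu_rad_per_s ** 2 * distance_kpc * KPC_M / C_M_S

# Hypothetical inputs for illustration only.
fractional = shklovskii_fractional_rate(25.0, 1.0)
pb_s = 10.1 * 3600.0                # orbital period quoted for PSR B1534+12
pbdot_shklovskii = fractional * pb_s

print(f"apparent Pb_dot ~ {pbdot_shklovskii:.1e}")
```

For these inputs the apparent term is several times $10^{-14}$, a substantial fraction of an observed orbital decay of order $10^{-13}$, which is why an uncertain distance limits the test. Inverting the relation for $d$, given a trusted prediction for the intrinsic decay, is the distance technique described above.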


2.1.3. The double pulsar, PSR J0737−3039A/B


The discovery of the double pulsar system?!®* heralded a remarkable era for inves- 
tigation of relativistic effects in double-neutron-star systems. In this system, the 
A pulsar was formed first and subsequently spun up to approximately its current 
period of 23ms by accretion from its evolving binary companion. The companion 
then imploded to form the B pulsar which has since spun down to its current period 
of about 2.8 s. The orbital period is only 2.5 h, less than a third of that for PSR
B1913+16, and the projected semi-major axis $a_1 \sin i$ is about 60% that of PSR
B1913+16. These parameters imply relativistic effects much larger than those seen
for PSR B1913+16. For example, the predicted relativistic periastron advance is
16.9° yr$^{-1}$, about four times the value for PSR B1913+16. Added to that, the
orbit is seen within a degree or so of edge-on, not only allowing detailed measure- 
ment of the Shapiro delay, but also resulting in eclipses of the A pulsar emission by 
the magnetosphere of the B pulsar. Finally, the still-unique detection of the second 
neutron star as a pulsar allows a direct measurement of the mass ratio of the two 
stars from the ratio of the two nonrelativistic Roemer delays (aj sini). 

Four post-Keplerian parameters ($\dot{\omega}$, $\gamma$, $r$ and $s$) were detected in just seven
months of timing data from the Parkes 64-m and Lovell 76-m Telescopes.°? Further
observations, including data from the Green Bank 100-m Telescope, with a 2.5-year
data span give the currently most stringent test of GR in strong-field conditions.”’”°
Figure 8 shows the mass-mass diagram based on these results together with the
measurement of geodetic spin precession for the B pulsar described below. A total of
six post-Keplerian parameters together with the mass ratio $R$ gives five independent
tests of GR. As well as the three post-Keplerian parameters, $\dot{\omega}$, $\gamma$ and $\dot{P}_b$, observed
for PSRs B1913+16 and B1534+12 (Fig. 6), for the double pulsar we have
the mass ratio $R$, the Shapiro delay parameters $r$ and $s$ and a measurement of the
rate of geodetic precession $\Omega_p$. The observed Shapiro delay (Fig. 7) shows that the
J0737−3039A/B orbit inclination angle is 88.7° ± 0.7°.
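
The Shapiro-delay parameters for the double pulsar can be reproduced from numbers given in this section. The Python sketch below computes the range $r$ from Eq. (6) using the companion mass in Table 1, and the shape $s$ from the quoted inclination. The peak-delay estimate uses the standard low-eccentricity form of the delay, $-2r\ln(1 - s\sin\phi)$, which is an assumption of this sketch and is not written out in the text; the variable names are illustrative.

```python
import math

T_SUN = 4.925490947e-6        # GM_sun/c^3 in seconds (value quoted in the text)

m2 = 1.2489                   # companion (B pulsar) mass in Msun, from Table 1
inclination_deg = 88.7        # orbital inclination quoted in the text

r = T_SUN * m2                                   # Eq. (6): range, in seconds
s = math.sin(math.radians(inclination_deg))      # shape, s = sin i

# Assumed low-eccentricity form: delay(phi) = -2 r ln(1 - s sin(phi)),
# which peaks at superior conjunction (sin(phi) = 1).
peak_delay = -2.0 * r * math.log(1.0 - s)

print(f"r = {r * 1e6:.2f} us, s = {s:.5f}, peak delay ~ {peak_delay * 1e6:.0f} us")
```

The shape parameter reproduces the Table 1 value of 0.99974, and the peak delay is of order 100 μs. Because the peak depends on $\ln(1 - s)$, it is very sensitive to the inclination when the orbit is nearly edge-on.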
Fig. 7. Observed Shapiro delay as a function of orbital phase for PSR J0737−3039A. Upper
panel: Timing residuals after fitting for all parameters except the Shapiro-delay terms, r and s, 
with these set to zero. Lower panel: The full Shapiro delay obtained by taking the best-fit values 
from a full solution, but setting r and s to zero. The grey line is the prediction based on GR 
(Ref. 74). 


e The B pulsar became undetectable as a radio pulsar in 2008, most probably because the beam
precessed out of our line of sight.!°! Because of uncertainty about the beam shape, the date of its 
return to visibility is very uncertain, but it should be before 2035. 
As mentioned above, this nearly edge-on view of the orbit results in eclipses of
the A-pulsar emission by the magnetosphere of the B pulsar. These eclipses last 
only about 30s, showing that the B-pulsar magnetosphere is highly modified by 
the relativistic wind from pulsar A.°? Remarkably, observations with high time 
resolution made with the Green Bank Telescope showed that the eclipse is modu- 
lated at the spin period of pulsar B.°° Modeling of the detailed eclipse profile by 
absorption in the doughnut-shaped closed-field-line region of the magnetosphere 
by Lyutikov and Thompson* allowed determination of the geometry of the binary 
system including the offset between the B-pulsar spin axis and the orbital angu- 
lar momentum axis, the so-called “misalignment angle” which they estimate to be 
about 60°. Even more remarkably, detailed measurements of the eclipse profile over 
about four years enabled Breton et al. (Ref. 20) to directly estimate the rate of geodetic
precession as 4.77° ± 0.66° yr$^{-1}$, consistent with the predicted precessional period of
70.95 years based on GR [Eq. (8)]. It is notable that no secular profile evolution
has been observed for PSR J0737−3039A, implying that, unlike for the B pulsar,
the misalignment angle for pulsar A is very small.*° 

The most precisely measured post-Keplerian parameter is $\dot{\omega}$. This constraint is
nearly orthogonal to that from the mass ratio $R$ (Fig. 8), giving values for the two
neutron-star masses of $m_1 = 1.3381 \pm 0.0007\,M_\odot$ and $m_2 = 1.2489 \pm 0.0007\,M_\odot$.
Together with the accurately measured Keplerian parameters, these two masses can 
Fig. 8. Plot of companion mass ($m_2$) versus pulsar mass ($m_1$) for PSR J0737−3039 with observed
constraints interpreted in the framework of GR. The inset shows the central region at an expanded 
scale, illustrating that GR is consistent with all constraints (M. Kramer, private communication). 




be used to predict the remaining five post-Keplerian parameters using Eqs. (4)-(8).
As Fig. 8 shows, GR gives a self-consistent description of the orbital motions, with
all five independent constraints consistent with the masses derived from $\dot{\omega}$ and $R$.

The most stringent test comes from the derived value of $s$, 0.99974 ± 0.00039.
The ratio of the observed and predicted values of $s$ is 0.99987 ± 0.0005, by far the
most constraining test of GR in the strong-field regime. Furthermore, this test is
qualitatively different to that for PSR B1913+16 in that it is based on a nonradiative
prediction. With just 2.5 years of data analyzed, the orbital decay term for
the double pulsar is not measured as precisely as that for PSR B1913+16. However,
since the phase offset grows quadratically (Fig. 4) and the effect of receiver
noise decreases as $T^{-1/2}$, where $T$ is the data span (assuming approximately uniform
sampling), the precision of the $\dot{P}_b$ measurement should improve as $T^{5/2}$.
Furthermore, J0737−3039A/B is much closer to the Sun than PSR B1913+16,
thus reducing the magnitude and uncertainty of the correction due to differential 
Galactic acceleration. Also, using very long-baseline interferometry (VLBI), Deller 
et al.?° have shown that the transverse space velocity of the binary system is small, 
just a kms~!, and so the Shklovskii correction to the observed P, is also small 
and accurately known. For these reasons, the orbit-decay test with the Double Pul- 
sar will not be limited by these corrections, at least for another decade or so. Beyond 
that, improved models for the Galactic gravitational potential can be expected from 
analysis of GAIA data!°® and improved parallax and proper motion measurements 
may come from further VLBI observations, both reducing the uncertainty in these 
kinematic corrections. 
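
The consequence of the span scaling is easy to quantify: a signal that grows quadratically with the data span, combined with noise that averages down as the inverse square root of the span, gives a precision that improves as the span to the power 5/2. The Python sketch below evaluates the improvement factor; the function name and the 10-year comparison span are illustrative assumptions.

```python
def pbdot_precision_gain(old_span_yr, new_span_yr):
    """Improvement factor in Pb_dot precision, scaling as span**(5/2)."""
    return (new_span_yr / old_span_yr) ** 2.5

# From the 2.5-year span discussed in the text to an assumed 10-year span.
gain = pbdot_precision_gain(2.5, 10.0)
print(f"precision improves by a factor of ~{gain:.0f}")
```

Quadrupling the data span therefore improves the orbital-decay measurement by a factor of about 32, provided the sampling stays roughly uniform and the kinematic corrections remain subdominant.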


Several other post-Keplerian parameters can in principle be observed in binary 
systems,*° testing other aspects of relativity. Kramer and Wex” show that only 
one of these, $\delta_\theta$, which quantifies relativistic deformations of the binary orbit, is
potentially measurable with the double pulsar system. Even this will take more 
than a decade to reach an interesting level of precision given reasonable expectations 
about future observations. 

A potentially exciting prospect is to use measurement of higher-order terms for 
relativistic periastron precession to put constraints on the moment of inertia of a 
neutron star.?? At the second post-Newtonian level, the periastron precession has 
three components: 


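$$\dot{\omega} = \dot{\omega}_{\rm 1PN} + \dot{\omega}_{\rm 2PN} - \dot{\omega}_{\rm SO}, \qquad (9)$$


where $\dot{\omega}_{\rm 1PN}$ is given by Eq. (3), $\dot{\omega}_{\rm 2PN}$ is the second post-Newtonian $(v/c)^4$ contribution
and $\dot{\omega}_{\rm SO}$ gives the contribution from spin-orbit coupling.!®? This latter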
term depends on the moment of inertia and spin of the A pulsar (the spin angular 
momentum of the B pulsar is negligible) and can in principle be measured if the 
two leading components in Eq. (9) can be measured to sufficient precision. Kramer 
and Wex” show that, given expected advances in precision timing and astrometry, 
a significant result could be obtained in 20 years or so. 
Spin-orbit coupling can also lead to a nonlinear variation in $\omega$ as a function of
time and a time variation in the projected semi-major axis of pulsar A, $a_1 \sin i$.”
However, the nonlinear terms depend on $\sin\theta$, where $\theta$ is the misalignment angle.
As mentioned above, it seems as though $\theta_A$ is very small, so unfortunately these
terms may be difficult to detect.


2.1.4. Measured post-Keplerian parameters 


Table 1 summarizes the measured post-Keplerian parameters for the eleven pulsar 
binary systems where three or more post-Keplerian parameters have been mea- 
sured. Since only two measurements are required to determine the stellar masses, 
these systems provide the opportunity for tests of theories of relativistic gravity. Of 
the eleven systems, in six cases the companion star is believed to be a neutron star, 
in a further four cases a white dwarf companion is more likely and in one case the 
companion is probably a main-sequence star. This latter system, PSR J1903+0327,
evidently has had an unusual evolutionary history as a triple system in which one
component was ejected (Ref. 50). It is currently not clear if there is a significant contribution
to the observed $\dot{\omega}$ from kinematic effects and/or spin-orbit coupling, so the
utility of this system for tests of relativistic gravity is limited. There are a further ten
pulsar systems in the literature where just two post-Keplerian parameters have 
been measured, enabling estimates of the stellar masses, but no tests of relativistic 
gravity. 

Although an independent measurement of relativistic spin precession has been 
possible for just two systems so far, as shown in Table 1, secular changes in observed 
pulse profiles have been observed for four other pulsars and attributed to geodetic 
precession of the pulsar spin axis. 


2.2. Tests of equivalence principles and alternative theories of gravitation


Pulsars and especially binary pulsars have unique advantages in testing theories of 
relativistic gravitation as a result of their often rapid spin, short orbital periods 
and the ultra-high density of the underlying neutron stars. As we have shown
above, GR has been amazingly successful in describing all measurements to date.
Nevertheless, investigations of quantum gravity and cosmology suggest that, in
some regimes, extensions or modifications of GR may be required. This strongly 
motivates a search for departures from GR within existing experimental capabilities. 

Gravitational theories have equivalence principles at their heart. The weak 
equivalence principle (WEP) is basic to Newtonian gravity, stating that acceler- 
ation in a gravitational field is independent of mass or composition. The strong 
equivalence principle (SEP) adds Lorentz invariance (no preferred reference frame) 
and position invariance (no preferred location) for both gravitational and non- 
gravitational interactions. GR satisfies the SEP whereas other theories may violate 
the SEP or even the WEP in one or more respects. 


Table 1. Binary pulsars with three or more significant measured post-Keplerian parameters.

Pulsar/Parameter                                J0437−4715   J0737−3039A/B   J0751+1807   J1141−6545   B1534+12        J1756−2251
Peri. advance $\dot{\omega}$ (° yr$^{-1}$)      0.016(8)     16.8995(7)      -            5.3096(4)    1.7557950(19)   2.58240(4)
Time dilation $\gamma$ (ms)                     -            0.3856(26)      -            0.773(11)    2.0708(5)       0.001148(9)
Orb. P deriv. $\dot{P}_b$ ($10^{-12}$)          3.73(6)^b    -1.252(17)      -0.031(5)    -0.403(25)   -0.1366(5)      -0.229(5)
$s = \sin i$                                    0.6746(28)   0.99974(39)     0.90(5)      -            0.9772(16)      0.93(4)
Comp. mass $m_2$ ($M_\odot$)                    0.254(18)    1.2489(7)       0.191(15)    -            1.35(5)         1.6(6)
Geod. prec. $\Omega_p$ (° yr$^{-1}$)            -            4.77(66)^c      -            Note d       0.59(10)        -
Binary companion^a                              He WD        NS              He WD        CO WD        NS              NS
References                                      141          74, 20          96, 97       15           49              46

Pulsar/Parameter                                J1807−2459B  J1903+0327      J1906+0746   B1913+16     B2127+11C
Peri. advance $\dot{\omega}$ (° yr$^{-1}$)      0.018339(4)  0.0002400(2)    7.5841(5)    4.226598(5)  4.4644(1)
Time dilation $\gamma$ (ms)                     26(14)       -               0.470(5)     4.2992(8)    4.78(4)
Orb. P deriv. $\dot{P}_b$ ($10^{-12}$)          -            -               -0.56(3)     -2.423(1)^b  -3.96(5)
$s = \sin i$                                    0.99715(20)  0.9759(16)      -            -            -
Comp. mass $m_2$ ($M_\odot$)                    1.02(17)     1.03(3)         -            -            -
Geod. prec. $\Omega_p$ (° yr$^{-1}$)            -            -               Note d       Note d       Note d
Binary companion^a                              CO WD(?)     MS              NS(?)        NS           NS
References                                      82           50              139, 41      145, 143     62

Notes:
a Binary companion types: CO WD: Carbon-Oxygen White Dwarf; He WD: Helium White Dwarf; MS: Main-sequence star; NS: Neutron star.
b Dominated or significantly biased by kinematic effects.
c For PSR J0737−3039B.
Comparison of different gravitational theories has been greatly facilitated by
the “parametrized post-Newtonian” (PPN) formalism, which describes observable
or potentially observable phenomena in a theory-independent way. This formalism
was first developed by Will and Nordtvedt!° for “weak-field” situations, that is,
where $\epsilon \sim GM/Rc^2 \ll 1$, where $G$ is the Newtonian gravitational constant, $M$ and
$R$ characterize the mass and size of the object and $c$ is the velocity of light. Many
gravitational tests are performed within the solar system where $\epsilon \lesssim 10^{-5}$, firmly
in the weak-field regime. However, in the vicinity of a neutron star $\epsilon \sim 0.2$ and so
strong-field effects are potentially important. A number of authors have considered 
the generalization of the PPN formalism to strong-field situations (see, e.g. Refs. 31, 
148 and 35) allowing investigation of these effects. 
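
The contrast between the weak-field and strong-field regimes can be illustrated by evaluating the compactness, the gravitational constant times mass divided by radius and the speed of light squared, for the Sun and for a neutron star. In the Python sketch below the function name is illustrative, and the neutron-star mass (1.4 $M_\odot$) and radius (10 km) are assumed representative values rather than measurements.

```python
G_SI = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
C_M_S = 299792458.0       # speed of light, m/s
M_SUN_KG = 1.989e30       # solar mass, kg
R_SUN_M = 6.957e8         # solar radius, m

def compactness(mass_kg, radius_m):
    """Dimensionless field strength epsilon ~ G M / (R c^2)."""
    return G_SI * mass_kg / (radius_m * C_M_S ** 2)

eps_sun = compactness(M_SUN_KG, R_SUN_M)
# Assumed neutron-star parameters: 1.4 solar masses, 10 km radius.
eps_ns = compactness(1.4 * M_SUN_KG, 1.0e4)

print(f"Sun: {eps_sun:.1e}, neutron star: {eps_ns:.2f}")
```

The Sun gives a value of about $2 \times 10^{-6}$, while the assumed neutron star gives about 0.2, five orders of magnitude larger.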

Some Lorentz-violating theories predict a dependence of the velocity of light on 
photon energy or polarization.!4° Pulsar observations can be used to limit these the- 
ories, but potentially stronger limits come from observations of gamma-ray bursts, 
polarized extra-galactic radio sources and the cosmic microwave background.%° 

A recent comprehensive review of observational limits on theories of gravitation 
by Will can be found in Ref. 149. Pulsar tests of relativistic gravitation have been 
reviewed by Stairs (Ref. 129) and, more recently, by Wex (Ref. 146). Further details on many of
the topics discussed here may be found in these reviews. 


2.2.1. Limits on PPN parameters 


The standard PPN formalism has ten parameters: $\gamma_{\rm PPN}$, $\beta_{\rm PPN}$, $\xi$, $\alpha_1$, $\alpha_2$, $\alpha_3$, $\zeta_1$,
$\zeta_2$, $\zeta_3$ and $\zeta_4$. In GR $\gamma_{\rm PPN}$, describing space curvature per unit mass, and $\beta_{\rm PPN}$,
describing superposition of gravitational fields, are unity and all others are zero. $\xi$
describes preferred-location effects, the $\alpha_i$ preferred-frame effects and the others
describe violations of momentum conservation. ($\alpha_3$ also may be nonzero in this
case.) Pulsar observations do not directly constrain $\gamma_{\rm PPN}$ or $\beta_{\rm PPN}$ but are important
in constraining many of the remaining PPN parameters and in fact currently place 
the strongest constraints on several parameters. 

Damour and Schäfer** recognized that wide-orbit low-eccentricity binary pulsars,
which generally have a white-dwarf companion, could provide a valuable test
of the SEP through a strong-field extension of the solar-system tests pioneered by 
Nordtvedt.°® A violation of the SEP would cause bodies with different gravitational 
self-energy to fall at different rates in an external gravitational field. In a pulsar— 
white-dwarf binary system, this results in a forced eccentricity in the direction of 
the gravitational field, that of the Galaxy in this case. This eccentricity is given by 


$$|e_F| = \frac{3}{2}\, \frac{\Delta\, g_\perp}{\dot{\omega}\, a\, (2\pi/P_b)}, \qquad (10)$$


where $g_\perp$ is the projection of the Galactic gravitational field on to the orbital plane,
$\dot{\omega}$ is the relativistic periastron advance, $a$ is the orbit semi-major axis and $P_b$ the
orbital period. The ratio of the gravitational mass $m_g$ and the inertial mass $m$
(which is exactly one if the SEP is obeyed) for body $i$ is described by


$$\left(\frac{m_g}{m}\right)_i = 1 + \Delta_i = 1 + \eta_N \left(\frac{E_g}{mc^2}\right)_i + \eta'_N \left(\frac{E_g}{mc^2}\right)_i^2 + \cdots, \qquad (11)$$


where $\eta_N$ is the Nordtvedt parameter, a function of several PPN parameters (see,
e.g. Ref. 149) and $\Delta = \Delta_1 - \Delta_2$. Violations of the SEP will result in nonzero $\Delta$.

Since Damour and Schäfer?? first proposed this method of testing SEP violations,
the number of suitable pulsar-white-dwarf binary systems has increased
greatly. Gonzalez et al.°? combined data for 27 systems to place a 95% confidence
upper limit on $|\Delta|$ of $4.6 \times 10^{-3}$.^f Since $E_g/mc^2 \sim 0.1$ for a neutron star, this limit is
not as strong as the weak-field limit on $\eta_N < 3 \times 10^{-4}$ from lunar laser ranging.'>4
However, it does enter the strong-field regime and test possible violations of the
SEP that solar-system tests cannot reach.
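
The comparison between the pulsar limit and the lunar-laser-ranging limit can be made explicit. The Python sketch below keeps only the leading term of the mass-ratio expansion and neglects the white dwarf's much smaller self-energy, so that the limit on the mass-ratio difference divided by the neutron-star fractional self-energy gives an equivalent bound on the Nordtvedt parameter. Both simplifications are assumptions of this sketch, and the variable names are illustrative.

```python
# Values quoted in the text.
delta_limit = 4.6e-3            # 95% upper limit on |Delta| from pulsar binaries
self_energy_fraction = 0.1      # E_g / (m c^2) for a neutron star
eta_limit_llr = 3e-4            # weak-field limit on eta_N from lunar laser ranging

# Leading-order conversion (assumed): |eta_N| <~ |Delta| / (E_g / m c^2),
# neglecting the white dwarf's self-energy and higher-order terms.
eta_equivalent = delta_limit / self_energy_fraction
ratio = eta_equivalent / eta_limit_llr

print(f"equivalent eta_N bound ~ {eta_equivalent:.3f} ({ratio:.0f}x weaker than LLR)")
```

The equivalent bound is about 0.05, roughly two orders of magnitude weaker than the lunar-laser-ranging limit. The pulsar test nevertheless probes the higher-order, strong-field terms that solar-system bodies cannot reach.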

The recent discovery by Ransom et al.!%4 of a remarkable triple system containing
a pulsar, PSR J0337+1715, and two white-dwarf stars in essentially coplanar
orbits, one in a relatively tight 1.6-day orbit with the pulsar and the other in a
much wider 327-day orbit around the inner system, opens up the possibility of
a much more sensitive test of the SEP. Precise timing observations of the pulsar
have already shown that the motion of the inner system is strongly affected by
the gravitational field of the outer white dwarf. The gravitational field of this
star, which has a mass of about 0.41 $M_\odot$, at the inner system is at least six
orders of magnitude larger than the Galactic gravitational field at the position
of a typical pulsar and so the effect of SEP violations may be expected to be correspondingly
larger.°! Observations over several orbital periods of the outer star will
almost certainly be necessary to isolate any SEP-related effects from other orbit
perturbations.

The wide-orbit low-eccentricity binary pulsars can also be used to test for violations
of local Lorentz invariance (LLI) of the gravitational interaction and momentum
conservation that are described by the parameter $\alpha_3$ and its strong-field
generalization $\hat{\alpha}_3$. Bell and Damour!? showed that such violations produce a forced
eccentricity analogous to that produced by the Nordtvedt effect given by


$$|e_F| = \hat{\alpha}_3\, \frac{c_p |w|}{24\pi}\, \frac{P_b^2}{P}\, \frac{c^2}{G(m_1 + m_2)}\, \sin\beta, \qquad (12)$$


where $c_p$ is a dimensionless “compactness” parameter, for neutron stars about 0.2,
and $\beta$ is the angle between $w$, the velocity of the system with respect to a reference
frame defined (for example) by the cosmic microwave background, and the pulsar’s
spin axis. A similar analysis to that for the generalized Nordtvedt effect resulted
in a 95% confidence limit of $|\hat{\alpha}_3| < 4.0 \times 10^{-20}$, some 13 orders of magnitude lower
than the best solar-system test.!°° It is worth noting that the observed small scatter


f This limit may be slightly underestimated; see Wex (Ref. 146).
in the period derivatives of MSPs had already been used by Bell (Ref. 11) to limit $|\alpha_3|$ to
ne 1. 

Strong-field limits on the other two PPN parameters describing preferred-frame
effects, specifically LLI, $\hat{\alpha}_1$ and $\hat{\alpha}_2$, can also be obtained from low-eccentricity binary
pulsar observations.*! Nonzero $\hat{\alpha}_1$ induces a forced eccentricity in the direction of
motion of the binary system velocity $w$, analogous to the Nordtvedt and $\hat{\alpha}_3$ tests discussed
above, whereas nonzero $\hat{\alpha}_2$ induces a precession of the orbital angular momentum
about $w$. The best current limits come from observations of the binary pulsar
systems PSRs J1012+5307 and J1738+0333, both of which have short orbital periods,
~0.60 and ~0.35 days respectively, and extremely low orbital eccentricities
e S$ 10-'.!*! Furthermore, both of these pulsars have optically identified binary 
companions which, together with proper motion measurements from the timing 
observations, allow the three-dimensional space velocity of the binary system to 
be determined. For these pulsars, the observational data span is sufficiently long 
($\gtrsim 10$ years) that relativistic periastron advance has significantly changed the orientation
of the intrinsic eccentricity vector relative to the direction of any forced
eccentricity, resulting in a potentially detectable change in the total eccentricity.
The strongest limit on $\hat{\alpha}_1$ comes from observations of PSR J1738+0333 as shown in
Fig. 9, conservatively $\hat{\alpha}_1 < 4 \times 10^{-5}$. This is not only better than the best weak-field
limit of $\alpha_1 < 2 \times 10^{-4}$ from lunar laser ranging,®* but also constrains strong-field
effects.
Fig. 9. Constraints on the strong-field PPN parameter $\hat{\alpha}_1$ from timing observations of the low-eccentricity
binary pulsar PSR J1738+0333. The constraints are a function of the unknown orientation
of the binary system on the sky (described by the longitude of the ascending node $\Omega$)
and the unknown sign of sin. If the PPN parameter $\hat{\alpha}_2$ is assumed to be zero, then only certain
ranges of $\Omega$ are permitted. The shading corresponds to 68%, 95% and 99.7% confidence limits on
$\hat{\alpha}_1$ (Ref. 121).
A precession of the orbital axis about the system velocity vector $w$ induced
by a nonzero value of $\hat{\alpha}_2$ is potentially observable as a time variation $\dot{x}$ in the
projected semi-major axis of the pulsar orbit $x = a_1 \sin i$. $\dot{x}$ is one of the possible
post-Keplerian parameters in standard timing solutions and significant values have
been measured for both PSRs J1012+5307 and J1738+0333 (Ref. 121). There are several
possible contributions to the observed $\dot{x}$ but in Ref. 121 it is shown that all of these
are negligible in these systems except that due to the changing orbit inclination $i$
resulting from proper motion of the system. This is a function of the unknown $\Omega$
values and for certain $\Omega$ values there is no constraint. Consequently only a probabilistic
limit on $\hat{\alpha}_2$ can be obtained. By combining results for the two pulsars, a
95% confidence limit of $|\hat{\alpha}_2| < 1.8 \times 10^{-4}$ is obtained.

This is not as constraining as a solar-system limit $|\alpha_2| < 2.4 \times 10^{-7}$ obtained from
the present deviation of the Sun’s spin axis from the solar-system orbit normal,®?
a limit that rests on the assumption that the two axes were aligned at the time of
formation of the solar system.

However, pulsars provide an even stronger constraint based on the stability of
the spin axis of isolated pulsars. Any precession of the spin axis of a pulsar is
likely to result in changes in the observed pulse profile (as observed with geodetic
precession in binary pulsars as discussed in Sec. 2). Shao et al. (Ref. 120) compared mean
pulse profiles for the isolated MSPs B1937+21 and J1744-1134 taken 10-12 years
apart with the same observing system and found no perceptible change in the pulse
width at 50% of the peak amplitude. They interpreted these results by assuming a
circular beam profile for the main pulse in PSR B1937+21 and for PSR J1744-1134.
All of the angles in the problem can be estimated from modeling of radio and
gamma-ray observations of the pulsar, taken together with the known direction of the
system velocity w with respect to the cosmic microwave background, except the
angle of the projected spin axis on the sky. Probability histograms for $\hat{\alpha}_2$ allowing
for this unknown angle are shown in Fig. 10. The final result for the 95% confidence
upper limit is $|\hat{\alpha}_2| < 1.6 \times 10^{-9}$, about four orders of magnitude better than the limit
described above based on orbital precession in pulsar binary systems and two orders
of magnitude better than the limit based on solar spin precession. The assumption
of circular beams in these MSPs is problematic, since there is good evidence for
caustic enhancement in MSP pulse profiles, both radio and gamma-ray,°*!> which
would tend to elongate the beam in the latitude direction, reducing the effect of
precession on the observed pulse profiles. However, taking this into account would
probably increase the limit by a factor less than ten, and so this limit would remain
the best available.

The stable pulse profiles of PSRs B1937+21 and J1744-1134 have also been
used to limit the PPN parameter $\xi$ describing local position invariance (LPI), also
known as the Whitehead parameter, and its strong-field counterpart $\hat{\xi}$ (Ref. 122).
The centripetal acceleration of Galactic rotation results in an anisotropy in the
local gravitational field at a pulsar, resulting in a precession of the pulsar spin
vector around the direction of the Galactic acceleration at an angular rate

$$\Omega^{\rm prec} = \hat{\xi} \left(\frac{2\pi}{P}\right) \left(\frac{v_G}{c}\right)^2 \cos\psi, \tag{13}$$

where $P$ is the pulsar spin period, $v_G$ is the velocity of Galactic rotation at the
pulsar and $\psi$ is the (unknown) angle between the pulsar spin vector and the Galactic
acceleration. Combining the results for the two pulsars in a way analogous to that
for the $\hat{\alpha}_2$ test described above, Shao and Wex (Ref. 122) obtain a 95% confidence limit
$|\hat{\xi}| < 3.9 \times 10^{-9}$. Even with the qualification about beam shapes mentioned above,
this limit is at least two orders of magnitude better than the next best limit, obtained
by considering the evolution of the solar spin-misalignment angle over the lifetime
of the solar system. 1!??

Fig. 10. Constraints on the strong-field PPN parameter $\hat{\alpha}_2$ based on the long-term stability of
the mean pulse profiles for the isolated MSPs B1937+21 and J1744-1134 (Ref. 120).

The PPN parameter $\zeta_2$ is one of a number of parameters that may be nonzero in
gravitational theories that violate conservation of total momentum.!4? A nonzero $\zeta_2$
would result in an acceleration of the center of mass of a binary system that changes
direction with periastron precession. This is best measured by looking for a change
in the rate of orbital decay in a binary system with large periastron precession and
a long data span. Will'4” used observations of PSR B1913+16 with a 15-year data
span to obtain a limit $|\zeta_2| < 4 \times 10^{-5}$. This test could be made more
stringent using the long data spans now available for PSR B1913+16 and other
double-neutron-star systems with high $\dot{\omega}$.


2.2.2. Dipolar gravitational waves and the constancy of G 


As described above, because of its tiny eccentricity, the pulsar-white-dwarf binary
system PSR J1738+0333 has played a significant role in limiting the PPN parameters
describing preferred frame effects. This system is composed of two very different
stars, the neutron star we see as a pulsar and the companion which we know to
be a white dwarf of mass $0.181\,M_\odot$ through its optical identification.” This and its
short orbital period ($\sim 8.5$ h) make it an ideal system for testing theories of gravity
that predict a dipolar component to gravitational-wave damping as well as general
scalar-tensor theories (Ref. 52).

From analysis of about 10 years of timing data obtained at Parkes and Arecibo
observatories, Freire et al. (Ref. 52) found an observed rate of orbital decay for PSR
J1738+0333 of $\dot{P}_b = (-17.0 \pm 3.1) \times 10^{-15}$. To obtain the intrinsic rate of orbital
decay, kinematic contributions from the differential acceleration of the binary system
and the solar system in the Galactic gravitational field and the Shklovskii effect
due to transverse motion must be subtracted, giving $\dot{P}_b^{\rm int} = (-25.9 \pm 3.2) \times 10^{-15}$.
Since the orbital parameters and the masses of the two stars are well known, the
orbit decay due to GR can be accurately determined: $\dot{P}_b^{\rm GR} = (-27.7^{+1.5}_{-1.9}) \times 10^{-15}$,
leaving a residual orbit decay of $\dot{P}_b^{\rm xs} = (2.0^{+3.7}_{-3.6}) \times 10^{-15}$.

This residual orbit decay is consistent with zero, which can be interpreted as
a further confirmation of the accuracy of GR. However, because of the very
different nature of the two stars in this binary system, this result also places strong
constraints on theories of gravity that predict a dipolar component to
gravitational-wave emission. Besides a dipolar component $\dot{P}_b^{D}$, there are several other possible
contributions to $\dot{P}_b^{\rm xs}$:

$$\dot{P}_b^{\rm xs} = \dot{P}_b^{\dot{M}} + \dot{P}_b^{T} + \dot{P}_b^{D} + \dot{P}_b^{\dot{G}}, \tag{14}$$

where $\dot{P}_b^{\dot{M}}$ is due to mass loss from the binary system, $\dot{P}_b^{T}$ is a term resulting from
tidal effects on the white dwarf (tidal effects on the neutron star are negligible) and
$\dot{P}_b^{\dot{G}}$ is decay resulting from a possible variation in the gravitational “constant” $G$ (Ref. 34).
Freire et al. (Ref. 52) showed that the likely $\dot{M}$ and tidal terms are small for this system,
$\lesssim 10^{-15}$, and so the limit on $\dot{P}_b^{\rm xs}$ is effectively a limit on $\dot{P}_b^{D} + \dot{P}_b^{\dot{G}}$.
Within certain restrictions on strong-field effects,°” the dipole term is given by

$$\dot{P}_b^{D} = -\frac{4\pi^2}{P_b}\, T_\odot m_2\, \frac{q}{q+1}\, \kappa_D S^2 + O(s^3), \tag{15}$$

where $T_\odot \equiv GM_\odot/c^3$ with masses expressed in solar units, $q = m_1/m_2$ is the mass
ratio (with subscript 1 referring to the neutron star), $S = s_1 - s_2$ is the difference in
“sensitivity” $s$ of the mass of each body to a scalar field $\phi$, where

$$s_i = \left(\frac{d \ln m_i(\phi)}{d\phi}\right), \tag{16}$$

and $\kappa_D$ is a body-independent constant that describes the dipole self-gravity
contribution in a given theory of gravity (see, e.g. Ref. 149). The sensitivity $s_i$ depends on
the stellar equation of state and, for neutron stars, is typically about 0.15, whereas
for a white dwarf it is $\sim 10^{-4}$. Therefore, if $\kappa_D$ is nonzero, dipole radiation will
contribute to the orbit decay.

The remaining term in the residual orbit decay is that due to possible variations
in $G$. In weak gravity, $\dot{G}/G$ has been constrained to be less than $4 \times 10^{-13}$ yr$^{-1}$
from lunar laser ranging experiments,° giving


Ms 0A F £08 % 10°. (17) 
and hence $|\kappa_D| < 2 \times 10^{-4}$. However, it is also possible to obtain independent
estimates of the effects of dipole radiation and $\dot{G}$ by combining the J1738+0333 results
with those for PSR J0437-4715.°° This southern pulsar is in a wider orbit than PSR
J1738+0333 and hence has a different mix of the dipole and $\dot{G}$ components, allowing
them to be separated. After accounting for the fact that a changing $G$ will also
change the stellar masses,!°° the formal results are $\dot{G}/G = (-0.6 \pm 1.6) \times 10^{-12}$ yr$^{-1}$
and $\kappa_D = (-0.3 \pm 2.0) \times 10^{-4}$, both effectively upper limits. While the limit on
$\kappa_D$ is the best available, the derived limit on $\dot{G}/G$ is about an order of magnitude
weaker than the result (actually a limit on the variation of $GM_\odot$) from the Mars
Reconnaissance Orbiter’! and from lunar laser ranging.©?

Interestingly, pulsars have provided two other independent limits on $\dot{G}/G$.
Thorsett+?° used determinations of neutron star masses from timing observations of
double-neutron-star systems that formed many gigayears ago. In standard formation
scenarios, the mass of a neutron star depends on the Chandrasekhar mass, the
maximum possible mass of a white dwarf star, just prior to the collapse to a neutron
star. The Chandrasekhar mass is proportional to $G^{-3/2}$ and so the observed small
range of neutron star masses implies that $\dot{G}/G < 4 \times 10^{-12}$ yr$^{-1}$. This limit has been
somewhat weakened by recent discoveries of both less massive and more massive
neutron stars in pulsar binary systems (see Ref. 70 for a recent review).

The very small rate of change of pulsar period $\dot{P}$ observed in some
pulsars (after correction for kinematic effects) may be used to set a further
independent limit.4°? A variation in $G$ will result in an inverse variation in the stellar
moment of inertia, with the exact relation depending on the neutron-star structure.
If the observed (intrinsic) $\dot{P}$ is entirely attributed to this effect, a limit of
$\dot{G}/G \lesssim 2 \times 10^{-11}$ yr$^{-1}$ is obtained.


2.2.3. General scalar-tensor and scalar-vector-tensor theories


Many alternate theories of gravity can be expressed in a “tensor-scalar” framework
in which a scalar field $\phi$ contributes to the “physical metric” $\tilde{g}_{\mu\nu}$ through a coupling
function $A(\phi)$

$$\tilde{g}_{\mu\nu} = A^2(\phi)\, g_{\mu\nu}, \tag{18}$$

where $g_{\mu\nu}$ is the usual tensor metric. The coupling function may be expressed in
different ways,!4° one of which is as an expansion around the asymptotic value of
the scalar field $\phi_0$

$$\ln A(\phi) = \ln A(\phi_0) + \alpha_0(\phi - \phi_0) + \beta_0(\phi - \phi_0)^2 + \cdots \tag{19}$$

(Ref. 31). For GR, $\alpha_0 = \beta_0 = 0$. In the well-known example of a tensor-scalar
theory, the Brans-Dicke theory, the scalar coupling is described by a single
parameter $\omega_{\rm BD}$. For $\omega_{\rm BD} \to \infty$, the Brans-Dicke theory approaches GR. For this theory,
$\alpha_0^2 = 1/(2\omega_{\rm BD} + 3)$ and $\beta_0 = 0$. In other theories, both the linear and quadratic
terms in Eq. (19) (and higher-order terms) may be nonzero.

The various observational constraints on PPN and post-Keplerian parameters
can be expressed as limits in the $(\alpha_0, \beta_0)$ space for tensor-scalar theories as shown
in Fig. 11 (Ref. 52). The Cassini experiment (Ref. 14) placed a strong limit on the PPN parameter
$\gamma_{\rm PPN} = 1 + (2.1 \pm 2.3) \times 10^{-5}$, which translates to a limit on $|\alpha_0|$ of about 0.003. Only
the limits on dipole gravitational radiation from the asymmetric binary systems
PSR J1141-6545!° and PSR J1738+0333 (Ref. 52) rival the Cassini limit over most of
the space. Since the precision of these measurements will increase with time, it
seems likely that ultimately binary systems such as these will provide the strongest
constraints on tensor-scalar theories.
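
The step from the Cassini measurement to a bound on $|\alpha_0|$ can be illustrated with a short calculation. This is only a sketch under stated assumptions: it uses the standard tensor-scalar relation $\gamma_{\rm PPN} - 1 = -2\alpha_0^2/(1+\alpha_0^2)$, which is not written out in the text, and it treats the quoted uncertainty of $2.3 \times 10^{-5}$ as the allowed size of $|\gamma_{\rm PPN} - 1|$. The function names are illustrative.

```python
import math


def alpha0_from_gamma(delta_gamma):
    """|alpha_0| implied by |gamma_PPN - 1| = delta_gamma.

    Assumes gamma_PPN - 1 = -2*alpha_0**2 / (1 + alpha_0**2), so that
    alpha_0**2 = delta_gamma / (2 - delta_gamma).
    """
    return math.sqrt(delta_gamma / (2.0 - delta_gamma))


def omega_bd_from_alpha0(alpha0):
    """Brans-Dicke parameter from alpha_0**2 = 1 / (2*omega_BD + 3)."""
    return 0.5 * (1.0 / alpha0**2 - 3.0)


# Assumption: use the quoted Cassini uncertainty as the bound on |gamma - 1|.
delta = 2.3e-5
a0 = alpha0_from_gamma(delta)
print(f"|alpha_0| ~ {a0:.4f}")
print(f"equivalent omega_BD ~ {omega_bd_from_alpha0(a0):.0f}")
```

With these inputs the result is $|\alpha_0| \approx 0.0034$, in line with the value of about 0.003 quoted above; the corresponding Brans-Dicke parameter is of order $4 \times 10^4$.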

PSR J0348+0432 and its white-dwarf companion form another asymmetric 
binary system, one that is distinguished by its very short orbital period (2.46 h) and 


[Figure 11: curves labeled LLR, B1534+12, SEP, J0737-3039, B1913+16, J1141-6545, J1738+0333 and Cassini; horizontal axis $\beta_0$, with tick marks from -6 to 6.]

Fig. 11. Constraints on the scalar-field parameters in the $(|\alpha_0|, \beta_0)$ plane from various
observational tests. Only the region below each line is allowed by the corresponding test. “SEP” refers to
the test of the strong equivalence principle based on low-eccentricity pulsar-white-dwarf binary
systems,°? “LLR” refers to lunar laser ranging results!®! and Cassini to the “Shapiro delay”
experienced by signals to and from the Cassini spacecraft (on its way to Saturn) as its line of
sight passed close to the Sun (Ref. 14). Other binary-system tests are labeled according to the pulsar
concerned. In GR, both $\alpha_0$ and $\beta_0$ are zero (Ref. 52).
massive neutron star ($2.01 \pm 0.04\,M_\odot$).° The high neutron-star mass is of interest
since some scalar-tensor theories predict a strongly nonlinear relationship between
the strength of dipole GW emission and neutron-star mass or self-gravity [Eq. (15)].
For most of the parameter space of the class of theories discussed by Freire et al. (Ref. 52),
the limits on effective scalar coupling from PSR J0348+0432 are currently not as
strong as those from PSR J1738+0333 but, because of the high neutron-star mass,
they place stronger limits on some other theories with a greater degree of nonlinear
coupling (Ref. 146).

Bekenstein!® has proposed a relativistic generalization of the so-called “MOND”
theory of gravity®! that seeks to avoid the need for dark matter in galactic dynamics.
The generalization invokes an additional vector field and hence is known as a
tensor-vector-scalar theory. Such theories relax some of the constraints on tensor-scalar
theories, in particular, the dipole radiation constraints, and allow significantly larger
values of $\alpha_0$. However, as shown by Freire et al. (Ref. 52), the binary pulsar results still
significantly constrain theories of this type and in fact are more constraining than
solar-system tests. With future observations, binary pulsar tests have the potential
to make this class of theories untenable.

In another example of the use of pulsar observations, especially the limits
on dipolar-GW radiation, to place limits on gravitational theories, Yagi et al.!°°
have strongly limited the allowed parameter space for the LLI-violating
“Einstein-Æther” and the “Khronometric” theories.


2.3. Future prospects 


As described above, GR has provided an accurate description of all pulsar timing
results obtained so far. However, continued refinement of existing methods and
development of new tests are highly desirable. Continued pulsar timing measurements,
especially with the advent of new and more sensitive observing facilities
such as the 500-m Arecibo-type FAST radio telescope in China®* and the Square 
Kilometre Array (SKA) in South Africa and Australia?* will certainly improve on 
existing limits and enable new tests of gravitational theories. They may even demon- 
strate a failure of GR and hence a need for a modified or conceptually different 
theory of gravity. Conversely, if GR is assumed to be valid, results of astrophysical 
significance can be deduced from the observations. For example, as described in 
Sec. 2.1.3, observations of higher-order terms in the relativistic perturbations may 
enable a measurement of the moment of inertia of a neutron star. 

These more sensitive radio telescopes can also be used to search for previously
unknown pulsars and binary systems that are suitable for tests of gravitational
theories. Past experience has shown that pulsar searches repeatedly turn up new
classes of objects. This potential is wonderfully illustrated by the recent discovery of
the pulsar triple system PSR J0337+1715, which promises to provide a strong limit
on violations of the strong equivalence principle. A dream for such searches is the
discovery of a pulsar in a close orbit around a black hole, as this would offer much
more stringent tests of gravity in the strong-field regime.®° Discovery of a pulsar
orbiting the black hole at the center of our Galaxy with an orbital period of a few
months or less could even allow a test of the so-called “no-hair” theorem for black
holes (Ref. 81).


3. The Quest for Gravitational-Wave Detection 


The direct detection of the gravitational waves (GWs) predicted by Einstein’s
general theory of relativity and other relativistic theories of gravity is one of the major
goals of current astrophysics. As described in Sec. 2, we have excellent evidence
from the orbital decay of binary systems for the existence of GWs at the level
predicted by GR, but up to now there has been no direct detection of the changing
curvature of spacetime induced by a passing GW. This changing curvature induces
a change in the proper distance between two test masses, described by the
gravitational strain $h = \delta L/L$. The problem is that, for any likely source, $h$ is tiny. For
example, the LIGO gravitational-wave detector? hopes to detect the merger of two
neutron stars at a distance of 100 Mpc for which $h \sim 10^{-22}$, a change in the length
of its 4 km arms of $10^{-18}$ m or $10^{-3}$ of the diameter of a proton.

Pulsar timing can measure a change in the proper distance between the pul- 
sar and the telescope. Systematic changes in timing residuals for a given pulsar 
reflect unmodeled changes in the effective time of emission, the pulsar position, the 
propagation path or the position of the telescope. With care, and with observa- 
tions of multiple pulsars, residual delays due to changing proper distances can be 
isolated, effectively giving us a set of interferometers with baselines of $\gtrsim 10^{16}$ km.
However, even in the best cases, we can only measure the interferometer “phase” 
to about 100ns, so that the limiting strain is about 10~'§. Unlike ground-based 
laser-interferometer systems, which are most sensitive to GW signals with frequen- 
cies around 100Hz, pulsar timing systems are most sensitive to signals with fre- 
quencies around the inverse of the data span, typically a few nanohertz. Potential 
sources of detectable GWs in this low-frequency band include super-massive black- 
hole (SMBH) binary systems in distant galaxies and cosmic strings in the early 
universe. 
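
As a quick illustration of why pulsar timing probes the nanohertz band, the sketch below converts a data span into the corresponding frequency $1/T$. The spans chosen are illustrative, not values from any particular experiment.

```python
SECONDS_PER_YEAR = 365.25 * 86400.0


def inverse_span_frequency_nhz(span_years):
    """Frequency 1/T, in nanohertz, for a timing data span T in years."""
    return 1e9 / (span_years * SECONDS_PER_YEAR)


# Illustrative data spans (assumed values).
for span in (5, 10, 20):
    print(f"T = {span:2d} yr  ->  1/T = {inverse_span_frequency_nhz(span):.2f} nHz")
```

A 10-year span corresponds to roughly 3 nHz, which is the "few nanohertz" scale mentioned above.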


3.1. Pulsar timing arrays 


The effect of GWs on pulsar timing signals was first considered by Sazhin'!? and
Detweiler,#? with the latter being the first to consider the effect of a GW from a
distant source passing over a pulsar and the Earth. For this case, it can be shown
that the net effect on the observed pulsar arrival times is simply the difference
between the effect of the GW passing over the pulsar and the effect of the GW
passing over the Earth. For a GW travelling in the $z$-direction, the redshift $z$ of
the pulse frequency $\nu$ for a pulsar at distance $d$ with direction cosines $(\alpha, \beta, \gamma)$ is
given by

$$z(t) = \frac{\nu_0 - \nu(t)}{\nu_0} = \frac{\alpha^2 - \beta^2}{2(1+\gamma)}\,\Delta h_{+} + \frac{\alpha\beta}{1+\gamma}\,\Delta h_{\times}, \tag{20}$$

where $\Delta h_A = h_A^p - h_A^e$, with $A$ representing the two possible wave polarization
states $(+, \times)$ in GR, and $h_A^p(t - d/c)$ and $h_A^e(t)$ the gravitational strain at the
pulsar and the Earth, respectively. The observed timing residuals are then given
by the integral of the redshift,

$$R(t) = \int_0^t z(t')\,dt'. \tag{21}$$
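
To give a feel for the size of the residuals implied by Eq. (21), the following sketch integrates a purely sinusoidal redshift numerically and compares the result with the analytic integral. The amplitude and frequency are assumed, illustrative values; the antenna-pattern factor and the pulsar term are ignored, so this is not a model of any real signal.

```python
import math


def residuals_from_redshift(z_samples, dt):
    """Cumulative trapezoidal integral R(t) = int_0^t z(t') dt'."""
    residuals = [0.0]
    for z_prev, z_next in zip(z_samples, z_samples[1:]):
        residuals.append(residuals[-1] + 0.5 * (z_prev + z_next) * dt)
    return residuals


# Assumed illustrative signal: z(t) = A sin(2 pi f t).
amplitude = 1e-15      # dimensionless redshift amplitude (assumption)
frequency = 1e-8       # GW frequency in Hz (assumption)
dt = 86400.0           # one sample per day, in seconds
times = [i * dt for i in range(3000)]
z = [amplitude * math.sin(2.0 * math.pi * frequency * t) for t in times]

R = residuals_from_redshift(z, dt)

# Analytic result: R(t) = A / (2 pi f) * (1 - cos(2 pi f t)), peak A / (pi f).
analytic_peak = amplitude / (math.pi * frequency)
print(f"numerical peak residual: {max(R) * 1e9:.1f} ns")
print(f"analytic peak residual:  {analytic_peak * 1e9:.1f} ns")
```

For these inputs the peak residual is a few tens of nanoseconds, which shows why timing precision at the 100 ns level or better is needed.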


The coefficients multiplying the $\Delta h_{+}$ and $\Delta h_{\times}$ terms are the “antenna patterns”
for the two polarizations as illustrated in Fig. 12. A pulsar located in the GW
propagation direction has zero sensitivity to the GW since its direction cosines $\alpha$
and $\beta$ are zero. Furthermore, despite the $(1+\gamma)$ term in the denominator of Eq. (20),
the response for a pulsar exactly in the $-z$ direction (the same direction as the GW
source) is also zero. This comes about because of a cancellation of the $(1+\gamma)$ term
by the expansion of $\Delta h_A$ when $1+\gamma \ll 1$ (Ref. 78).
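
The geometric coefficients can be evaluated directly. The sketch below assumes the form of Eq. (20) with coefficients $(\alpha^2-\beta^2)/[2(1+\gamma)]$ and $\alpha\beta/(1+\gamma)$ for a wave travelling along $+z$; the function name is illustrative. It covers only the coefficients, so the cancellation against $\Delta h_A$ at $\gamma = -1$ described above is outside its scope and that direction is rejected.

```python
import math


def antenna_patterns(alpha, beta, gamma):
    """Coefficients of Delta h_+ and Delta h_x for a GW travelling along +z.

    (alpha, beta, gamma) are the direction cosines of the pulsar.
    """
    if abs(alpha**2 + beta**2 + gamma**2 - 1.0) > 1e-9:
        raise ValueError("direction cosines must have unit norm")
    if 1.0 + gamma < 1e-12:
        # Exactly at gamma = -1 the coefficient is 0/0; the physical response
        # there depends on the behaviour of Delta h_A, not on this factor.
        raise ValueError("coefficients are undefined for gamma = -1")
    f_plus = (alpha**2 - beta**2) / (2.0 * (1.0 + gamma))
    f_cross = alpha * beta / (1.0 + gamma)
    return f_plus, f_cross


# Pulsar along the propagation direction: both coefficients vanish.
print(antenna_patterns(0.0, 0.0, 1.0))
# Pulsar perpendicular to the propagation direction, along x.
print(antenna_patterns(1.0, 0.0, 0.0))
# Pulsar close to the source direction (gamma near -1).
print(antenna_patterns(math.sqrt(1.0 - 0.99**2), 0.0, -0.99))
```

The last case shows the coefficient itself approaching unity near the source direction, so the vanishing response exactly at $-z$ comes from $\Delta h_A$ rather than from the geometric factor.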

A pulsar timing array (PTA) consists of a set of pulsars spread across the sky 
which have precise timing measurements over a long data span. The detection of 




Fig. 12. Effective “antenna pattern” for detection of a GW with pulsar timing. The wave is
propagating in the $+z$ direction and is assumed to have the $+$ polarization. The pattern for the $\times$
polarization is the same but rotated by 45° about the $z$-axis (Ref. 25).


See Ref. 5 for a rederivation of the Detweiler result. 
GWs by PTAs depends on the correlated timing residuals for different pulsars given
by the Earth term $h_A^e(t)$ in Eq. (20). GWs passing over the pulsars produce
uncorrelated residuals because of both the retarded time and the different GW environment
for each pulsar. Also the pulsars themselves have uncorrelated timing noise at some 
level, either intrinsic or resulting from uncorrected variations in interstellar delays. 
Because the expected GW strain is so weak, only MSPs have sufficient timing pre- 
cision to make GW detection with PTAs feasible. 

For an isolated source of continuous GWs, say an SMBH binary system in a 
nearby galaxy, in principle, both the Earth term and the pulsar term in Eq. (20) 
could be detected. For a rapidly evolving source, the pulsar term and the Earth 
term may be at different frequencies because of the retarded time of the pulsar term 
(see, e.g. Ref. 65). They can then be added incoherently to increase the detection 
sensitivity. For a nonevolving source, i.e. $\delta f_b \ll 1/T$ where $\delta f_b$ is the change in
binary orbital frequency over the pulsar timing data span $T$, in principle the pulsar
term and the Earth term could be summed coherently for optimal sensitivity. As 
is discussed further in Sec. 3.4, unfortunately we currently do not know enough 
pulsar distances to sufficient accuracy to make this coherent addition possible. In 
a PTA, the pulsar terms therefore add with random phase, washing out the fringes 
in the antenna pattern (see also Ref. 78) and adding “self-noise” to the signal from 
the Earth term. Since the antenna pattern (Fig. 12) has a maximum for pulsars 
roughly in the same direction as the GW source, the maximum response of a PTA 
is toward the greatest concentration of pulsars in the array. 

A stochastic background of nanohertz GWs from many SMBH binary systems
in distant galaxies is likely to be the signal first detected by PTAs. To a first
approximation, this background is also likely to be statistically isotropic, i.e. the
expectation value $\langle h^2 \rangle$ is independent of direction when averaged over typical data
spans. Hellings and Downs”° were the first to show that, in this case, the correlation
between GW-induced timing residuals for two pulsars separated by an angle $\theta$ on
the sky is dependent only on $\theta$ and not on the sky positions of the two pulsars. The
zero-lag correlation function, commonly known as the Hellings and Downs curve
and obtained by integrating the product of the antenna patterns (Fig. 12) for the
two pulsars over all possible GW propagation directions, is given by:

$$c_{\rm HD} = \frac{1}{2} + \frac{3x}{2}\left(\ln x - \frac{1}{6}\right), \tag{22}$$

where $x = (1 - \cos\theta)/2$, and is plotted in Fig. 13. $c_{\rm HD}$ goes negative for angular
separations around 90° and then positive again for pulsars that are more-or-less
opposite on the sky; this is a direct consequence of the quadrupolar nature of
GWs. It is also important to note that the limiting value as $\theta \to 0$ is 0.5, not 1.0.
This is a consequence of the fact that the pulsar terms in Eq. (20) are uncorrelated
and, on average, of equal amplitude to the Earth term. The scatter in the simulated
correlations results from the random phases of the pulsar terms and illustrates the
“self-noise” that limits the sensitivity of PTA experiments in the strong-signal limit.
Fig. 13. The Hellings and Downs correlation function, i.e. the correlation between timing
residuals for pairs of pulsars as a function of their angular separation for an isotropic stochastic
background of GWs. Also shown are simulated correlations between the 20 pulsars of the Parkes
PTA for a single realization of a strong GW signal that dominates all other noise contributions
(Ref. 58).
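
The behaviour of the Hellings and Downs curve described above can be checked with a few lines of code. This is a minimal sketch assuming the form $c_{\rm HD} = 1/2 + (3x/2)(\ln x - 1/6)$ with $x = (1-\cos\theta)/2$; the function name is illustrative.

```python
import math


def hellings_downs(theta_deg):
    """Hellings and Downs correlation for an angular separation in degrees."""
    x = (1.0 - math.cos(math.radians(theta_deg))) / 2.0
    if x == 0.0:
        # x * ln(x) -> 0 as x -> 0, leaving the limiting value of 0.5.
        return 0.5
    return 0.5 + 1.5 * x * (math.log(x) - 1.0 / 6.0)


for theta in (0, 45, 90, 135, 180):
    print(f"theta = {theta:3d} deg  ->  c_HD = {hellings_downs(theta):+.4f}")
```

The output reproduces the features noted in the text: a value of 0.5 at zero separation, a negative correlation near 90°, and a return to positive values (0.25) for pulsars opposite each other on the sky.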


3.2. Nanohertz gravitational-wave sources 
3.2.1. Massive black-hole binary systems 


There is good evidence that massive black holes form in the center of galaxies at 
very early times (see, e.g. Ref. 140) and also that merger events play a major role 
in galaxy growth (see, e.g. Ref. 132). When two galaxies, each containing a cen- 
tral massive black hole, merge, dynamical friction will result in the two black holes 
migrating to the center of the merged galaxy to form a binary system, with an 
estimated timescale for the migration of the order of gigayears (see, e.g. Ref. 68).
When the binary separation is less than about 1 pc, loss of energy to GWs becomes
the dominant orbital decay mechanism and the binary system will ultimately
coalesce, becoming a strong GW source as it spirals in. There remains much
controversy about the efficiency of orbital decay mechanisms as the binary separation
approaches a parsec, known as the “last parsec problem”. Some (see, e.g. Refs. 69
and 107) argue that dissipation mechanisms will quickly move the binary system
through this phase, whereas others (see, e.g. Ref. 27) argue that the binary is likely
to stall at separations where the gravitational decay is ineffective. Detection of a
stochastic GW background (GWB) would resolve this issue.

Since large numbers of binary systems with different orbital periods contribute
to the GWB, it is a broadband signal which is best described in the spectral domain.
It is convenient to express the amplitude of the GW signal in terms of the
dimensionless “characteristic strain”, defined by

$$h_c = 2f\,|\tilde{h}(f)|, \tag{23}$$

where $\tilde{h}(f)$ is the Fourier transform of $h(t)$ and $f$ is the GW frequency. (Note that,
for a circular binary system, the frequency of the emitted GW is twice the binary
orbital frequency, $f = 2f_b$.) Two other quantities that are often used to parametrize
GW spectra and detector spectral sensitivities are the square root of the one-sided
strain power spectral density

$$S_h(f)^{1/2} = h_c\, f^{-1/2} \tag{24}$$

and the GW energy density as a fraction of the closure energy density of the universe

$$\Omega_{\rm GW}(f) = \frac{2\pi^2}{3H_0^2}\, f^2 h_c(f)^2, \tag{25}$$

where $H_0$ is the Hubble constant.
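
The three ways of expressing a GW spectrum are related by simple scalings, sketched below. The relations used are the standard ones, $S_h^{1/2} = h_c f^{-1/2}$ and $\Omega_{\rm GW} = (2\pi^2/3H_0^2) f^2 h_c^2$; the Hubble constant of 70 km s$^{-1}$ Mpc$^{-1}$ and the example strain of $10^{-15}$ at a frequency of one cycle per year are assumptions chosen for illustration.

```python
import math

# Assumed Hubble constant: 70 km/s/Mpc converted to s^-1.
METERS_PER_MPC = 3.0857e22
H0_SI = 70.0e3 / METERS_PER_MPC

SECONDS_PER_YEAR = 365.25 * 86400.0


def sqrt_strain_psd(h_c, f):
    """Square root of the one-sided strain PSD, in Hz^-1/2."""
    return h_c / math.sqrt(f)


def omega_gw(h_c, f, hubble=H0_SI):
    """GW energy density per logarithmic frequency interval, in units of
    the closure density."""
    return (2.0 * math.pi**2 / (3.0 * hubble**2)) * f**2 * h_c**2


# Illustrative input (assumption): h_c = 1e-15 at f = 1 / (1 yr).
f_1yr = 1.0 / SECONDS_PER_YEAR
h_c = 1e-15
print(f"sqrt(S_h) = {sqrt_strain_psd(h_c, f_1yr):.2e} Hz^-1/2")
print(f"Omega_GW  = {omega_gw(h_c, f_1yr):.2e}")
```

Because $\Omega_{\rm GW}$ scales as $h_c^2$, a factor of two in characteristic strain corresponds to a factor of four in energy density.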

In order to understand the astrophysical implications of results obtained from
PTA experiments, it is necessary to have estimates of the likely strength of signals
from potential sources of nanohertz GWs. For a cosmological population of SMBH
binary systems at luminosity distance $D_L$ and redshift $z$, the local energy density
in GWs at frequency $f$ is given by:


=) de) a. @) ° 
fSe(f) | 2 | dzdM, (1+ z) D? dinf, ee 


where $M_c = (M_1 M_2)^{3/5}(M_1 + M_2)^{-1/5}$ is the binary chirp mass and $M_1$ and $M_2$
are the masses of the binary components, $d^2n/(dz\,dM_c)$ is the comoving density
of binary systems with redshift and chirp mass between $z$ and $z + dz$ and $M_c$ and
$M_c + dM_c$, respectively, and $dE_{\rm gw}/d\ln f_r$ is the total energy emitted by a single binary
system in the logarithmic frequency interval $d\ln f_r$, where $f_r = f(1 + z)$ (Refs. 97, 103 and 145).
The local GW energy density is related to the local characteristic strain by

$$f S_E(f) = \frac{\pi c^2}{4G}\, f^2 h_c(f)^2 \tag{27}$$

and for a circular binary system

$$\frac{dE_{\rm gw}}{d\ln f_r} = \frac{\pi^{2/3} G^{2/3}}{3}\, M_c^{5/3} f_r^{2/3}. \tag{28}$$

Therefore, we have
As Phinney (Ref. 103) has emphasized, the result that $h_c \propto f^{-2/3}$ for a
cosmological population of circular binary systems decaying through GW emission is
quite general and independent of any particular cosmology, black-hole mass
function or galaxy merger scenario. Consequently the spectrum of the GWB is often
parametrized as follows:

$$h_c(f) = A_{1\rm yr} \left(\frac{f}{f_{1\rm yr}}\right)^{\alpha}, \tag{30}$$

where $f_{1\rm yr} = (1\,{\rm yr})^{-1}$, $A_{1\rm yr}$ is the characteristic strain at $f_{1\rm yr}$ and $\alpha = -2/3$ for the
case described above. For pulsar timing experiments, the one-sided power spectrum
of the timing residuals is given by

$$P(f) = \frac{1}{12\pi^2 f^3}\, h_c(f)^2. \tag{31}$$

Consequently, a GWB produces a very “red” modulation of the timing residuals
with a spectral index of $-13/3$ for $\alpha = -2/3$.
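
The spectral index of $-13/3$ follows from combining the power-law strain spectrum with the $f^{-3}$ factor in the residual power spectrum, giving a slope of $2\alpha - 3$. The sketch below verifies this numerically, assuming $h_c(f) = A_{1\rm yr}(f/f_{1\rm yr})^{\alpha}$ and $P(f) = h_c(f)^2/(12\pi^2 f^3)$; the amplitude is an arbitrary illustrative value.

```python
import math

F_1YR = 1.0 / (365.25 * 86400.0)  # frequency of one cycle per year, in Hz


def characteristic_strain(f, a_1yr, alpha=-2.0 / 3.0):
    """Power-law characteristic strain spectrum."""
    return a_1yr * (f / F_1YR) ** alpha


def residual_power(f, a_1yr, alpha=-2.0 / 3.0):
    """One-sided power spectrum of the timing residuals."""
    return characteristic_strain(f, a_1yr, alpha) ** 2 / (12.0 * math.pi**2 * f**3)


def log_slope(func, f_low, f_high):
    """Logarithmic slope d ln(func) / d ln(f) between two frequencies."""
    return math.log(func(f_high) / func(f_low)) / math.log(f_high / f_low)


a_1yr = 1e-15  # illustrative amplitude (assumption)
slope = log_slope(lambda f: residual_power(f, a_1yr), 1e-9, 1e-8)
print(f"residual spectral index: {slope:.4f}  (expected {-13.0 / 3.0:.4f})")
```

The steepness of this spectrum means that the lowest frequencies, and hence the longest data spans, carry most of the signal power.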

In order to estimate the likely strength of this modulation, the factor
$d^2n/(dz\,dM_c)$ in Eq. (29) must be evaluated. This requires a prescription for the
cosmological evolution of massive black-hole binary systems in galaxies. Different
approaches to this problem have been taken by different authors. An early paper by
Jaffe and Backer® used observational constraints on close galaxy pairs coupled with
a black-hole mass function, whereas another early paper by Wyithe and Loeb!**
used a prescription for merger of dark-matter halos coupled with different scenarios
for growth of massive black holes in galaxies. The latter approach was developed
further by Sesana et al. (Ref. 115), who showed that the GWB spectrum steepens at
frequencies above about $10^{-8}$ Hz since the number of binary systems contributing to
the background at these frequencies becomes small. This is illustrated in the left
panel of Fig. 14 which shows that binary systems at $z < 2$ contribute most of
the strain to the GWB. The right panel shows that massive binary systems with
M. 2 10° Mo contribute most of the low-frequency GWB but that only systems 
with M, S 10° Mo contribute to the high-frequency end. Importantly, at the high- 
mass end, only a few systems contribute to the GWB. As shown in Fig. 15, this leads
to a steepening and large uncertainties in the expected GWB spectrum at
frequencies $\gtrsim 10^{-8}$ Hz. These results were confirmed and extended by Sesana et al. (Ref. 116) and
Ravi et al.!°° using the Millennium simulation of the cosmological evolution of dark
matter structures!?® to define a merger history together with various prescriptions
for galaxy and black-hole formation and growth.

While there is some consensus about the form of the GWB spectrum, there 
remain significant uncertainties. For example, Ravi et al.'°’ consider the effects of 
the stellar environment on the late evolution of massive black-hole binary systems in 
the cores of galaxies and conclude that the effect of dynamical friction is important, 
both in extracting energy from the binary system and inducing an eccentricity. Both 
Fig. 14. Left panel: Number of binary black-hole systems and their contribution to the
characteristic strain $h_c$ of the GWB as function of source redshift $z$. The solid lines are results from a Monte
Carlo approach and the dotted lines are from a semi-analytic analysis. In both panels, the upper
histograms are for a GW frequency $f = 8 \times 10^{-9}$ Hz and the lower histograms for $f = 10^{-7}$ Hz.
Right panel: The lower two panels are as for the left panel but as a function of source chirp mass
$M_c$. The upper panel shows the number of frequency bins spanned by the chirp over the 5-year
span of the simulation (Ref. 115).
Fig. 15. Characteristic strain spectrum for the GWB based on different prescriptions for
black-hole growth by accretion between mergers (solid, dashed and dotted lines) and different black-hole
mass functions (different panels) (Ref. 116).
of these have the effect of reducing the strength of the predicted GWB, especially at
the low-frequency end, consequently making its detection by PTAs more difficult.
On the other hand, McWilliams et al.8° argue for a model in which all black-hole
growth is by merger rather than by accretion after coalescence, which is the
main contributor to black-hole growth in the models discussed above. This leads to
predictions of a significantly larger GWB characteristic strain compared to previous
predictions and, hence, the imminent detection of the GWB by PTAs.

As Fig. 15 indicates, there is a possibility that the nearby universe could contain 
a massive black-hole binary system with an orbital period of the order of a few years 
that actually dominates the nanohertz GW spectrum. This opens up the exciting 
prospect of the GW detection and study of an isolated supermassive black-hole 
binary system using pulsar timing and even the possibility of detection and study of
the system in the electromagnetic bands, so-called “multi-messenger” astronomy.
Searches for binary GW sources will be described in Sec. 3.3 and the prospects for 
their detailed study will be discussed in Sec. 3.4. 


3.2.2. Cosmic strings and the early universe 


Cosmic strings and the related cosmic super-strings are one-dimensional topological 
defects which may have formed in phase transitions in the early universe. Cosmic 
strings occur in standard field-theory inflation models, whereas superstrings are 
found in brane inflationary models. The idea that such strings will oscillate and 
hence emit GWs was first proposed by Vilenkin.!4? Such oscillations may contribute 
to the stochastic GWB (see, e.g. Ref. 23) or generate bursts of GW radiation from 
string cusps and kinks (see, e.g. Ref. 36). The amplitude of GWs from cosmic strings 
is dependent on a large number of poorly known (or unknown) parameters and 
hence is very uncertain (see, e.g. Ref. 111). Key parameters are the string tension $\mu$,
usually parametrized by the dimensionless quantity $G\mu/c^2$, and the size $\alpha$ of string
loops relative to the horizon radius at the time of birth. Other significant parameters
for the GWB are the intrinsic spectral index $q$ of the GW emission, a characteristic
node number $n_*$ for the high-frequency cutoff in the emission spectrum and the
probability $p$ of “intercommutation”, that is, intersecting strings dividing and the
two parts exchanging. Such intercommutation can, for example, form two smaller
loops from an intersecting twist in a larger loop. For standard strings, $p = 1$ but it
may be less for superstrings.

Vibrating cosmic strings are likely to decay by emission of GWs in a series of
harmonics with fundamental frequency $2c/l$, where $l$ is the length of the loop, with
an initial value $\alpha D_H$, where $D_H$ is the horizon distance at the time of loop creation.
The rate of energy loss for vibration mode $n$ of a given loop is:

$$ \frac{dE_{{\rm gw},n}}{dt} = \Gamma G\mu^2 c\,\frac{n^{-q}}{\sum_{m} m^{-q}}, \qquad (32) $$

where $\Gamma$ is a factor depending on the shape of the loop, typically about 50 (Ref. 111). As
the loop loses energy, it shrinks and eventually disappears. The creation of loops
through intercommutation and their decay through GW emission sets up an equilibrium
distribution of loop sizes. Sanidas et al. (Ref. 111) compute the number density of
loops as a function of loop length and time and hence, using Eq. (32), the predicted
spectrum of the GWB from string loops as a function of the various parameters. As
Fig. 16 shows, the spectrum is very broad, extending all the way from nanohertz to
megahertz. Since the lowest ($n = 1$) frequency is inversely proportional to the loop length,
the weaker and smaller loops do not contribute to the nanohertz background.
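
As a rough illustration of why only large loops matter for PTAs, the fundamental frequency $2c/l$ quoted above can be inverted to give the loop length whose lowest mode falls at a given GW frequency. The sketch below is only an order-of-magnitude aid; the function names and the rounded constants are assumptions made for this example, not quantities taken from Ref. 111.

```python
# Minimal sketch: loop length <-> fundamental GW frequency, f = 2c/l.
# Function names and rounded constants are illustrative assumptions.

C = 2.99792458e8        # speed of light in m/s
PARSEC = 3.0856776e16   # one parsec in metres


def fundamental_frequency_hz(loop_length_m):
    """Fundamental (n = 1) emission frequency of a loop of the given length."""
    return 2.0 * C / loop_length_m


def loop_length_pc(frequency_hz):
    """Loop length (in parsecs) whose fundamental lies at the given frequency."""
    return 2.0 * C / frequency_hz / PARSEC


if __name__ == "__main__":
    # A loop radiating its fundamental at 1 nHz is roughly 20 pc long.
    print(f"l(1 nHz) = {loop_length_pc(1e-9):.1f} pc")
```

Loops much shorter than this radiate their fundamental at frequencies well above the nanohertz band, which is the point made in the text.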


3.2.3. Transient or burst GW sources 


The prime target of ground-based laser-interferometer GW detectors is the burst 
emitted at the coalescence of a double-neutron-star system. Such a burst is intense 
for just a few milliseconds and clearly cannot be detected by PTA experiments 
which typically are sensitive to signals of duration between a few weeks and a few 
years. Possible sources of longer-duration bursts are coalescence of SMBH binary 
systems, highly eccentric massive black-hole binary systems, and formation and 
decay of cusps and kinks in cosmic strings. 

A major difference between detection of GW bursts and continuous GW sources 
is that there is generally no interference between the Earth term and the pulsar 
terms in the signal detected by PTAs [Eq. (20)]. This is because the duration of 
the burst (by definition, less than the observational data span) is much less than 
the light-time to the pulsars and so, except for a source in the same direction as a 
pulsar, the burst will occur at very different times in the Earth and pulsar terms. 
Fig. 16. Energy density spectrum for GWs emitted by cosmic strings as a function of the
dimensionless string tension $G\mu/c^2$. Other parameters are held fixed at values
$\alpha = 10^{-7}$, $q = 4/3$, $n_* = 1$ and $p = 1$. The dashed spectrum is for the critical point
where $\Gamma G\mu/c^2 \approx \alpha$; spectra above this are for large loops and spectra below
are for small loops (Ref. 111).




The pulsar term reflects the effect of the burst on the pulsar at a time $d/c$ before the
burst arrives at the Earth, but the pulsar term is detected at a time $(d/c)(1 + \cos\theta)$
later than the Earth term, where $d$ is the pulsar distance and $\theta$ is the angle between
the pulsar and the GW propagation direction as seen from the Earth.
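
To give a feel for the size of this delay, the following sketch evaluates $(d/c)(1 + \cos\theta)$ for a pulsar at a given distance. The function name and the kiloparsec-to-light-year conversion constant are assumptions introduced for this illustration only.

```python
# Minimal sketch: delay between Earth term and pulsar term of a GW burst,
# (d/c)(1 + cos(theta)). Names and constants are illustrative assumptions.
import math

LIGHT_YEARS_PER_KPC = 3261.56  # one kiloparsec expressed in light-years


def pulsar_term_delay_years(distance_kpc, theta_deg):
    """Delay in years for a pulsar at distance_kpc and pulsar-GW angle theta."""
    light_time_years = distance_kpc * LIGHT_YEARS_PER_KPC
    return light_time_years * (1.0 + math.cos(math.radians(theta_deg)))


if __name__ == "__main__":
    for theta in (0.0, 90.0, 170.0):
        delay = pulsar_term_delay_years(1.0, theta)
        print(f"theta = {theta:5.1f} deg: delay = {delay:8.1f} yr")
```

For a pulsar at 1 kpc the delay is thousands of years except for angles close to 180 degrees, far longer than any PTA data span, which is why the Earth and pulsar terms of a burst do not overlap.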

Figure 17 shows the GW waveform produced by the coalescence of two black 
holes in a coordinate system where all the signal is in $h_+$. The maximum amplitude
of the waveform is about 0.1 in the time units of Fig. 17, or about $0.1\,cMT_\odot/D$
or $\sim 10^{-14} M_9/D_{\rm Gpc}$, where $M_9 = 10^{-9}M$ with $M$ the total system mass in solar
units, and $D_{\rm Gpc}$ is the (comoving) distance in Gpc. Although this is comparable to the
strain sensitivities achieved by current PTAs for continuous GW signals (see Sec. 3.3
below), even for the largest SMBHs, the timescale of the burst, $\sim 200\,MT_\odot$, is only
of order 10 days and the period of the oscillation is about an order of magnitude 
less than that. Not only is this too short to be resolved by any existing PTA, but 
the sensitivity over this short interval would be much less than that achieved for 
CW signals integrated over the entire data span. Space-based laser interferometer 
systems such as the planned eLISA? have the potential to directly detect these
bursts. 
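
The two scalings quoted above, a peak strain of about $0.1\,cMT_\odot/D$ and a burst timescale of about $200\,MT_\odot$, can be evaluated numerically as a sanity check. This is a minimal sketch: the function names are hypothetical and the constants are rounded values assumed for the example, with $T_\odot = GM_\odot/c^3$.

```python
# Minimal sketch: coalescence-burst amplitude and duration from the
# scalings 0.1*c*M*T_sun/D and 200*M*T_sun. Names are illustrative.

T_SUN = 4.925491e-6      # G*M_sun/c^3 in seconds
C = 2.99792458e8         # speed of light in m/s
GPC = 3.0856776e25       # one gigaparsec in metres
DAY = 86400.0            # seconds per day


def peak_burst_strain(total_mass_msun, distance_gpc):
    """Peak strain ~ 0.1 c M T_sun / D for total mass M (solar units)."""
    return 0.1 * C * total_mass_msun * T_SUN / (distance_gpc * GPC)


def burst_timescale_days(total_mass_msun):
    """Burst timescale ~ 200 M T_sun, converted to days."""
    return 200.0 * total_mass_msun * T_SUN / DAY


if __name__ == "__main__":
    print(f"peak strain: {peak_burst_strain(1e9, 1.0):.2e}")
    print(f"timescale:   {burst_timescale_days(1e9):.1f} days")
```

For $M = 10^9$ solar masses at 1 Gpc this gives a peak strain of a few times $10^{-15}$ and a timescale of about 11 days, consistent with the order-of-magnitude figures in the text.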

However, Fig. 17 also shows another effect known as GW “memory” which is 
potentially detectable with pulsar timing. During the coalescence event, a non- 
oscillatory component to the gravitational strain builds up, so that at the end of 
the “ring-down” phase, the strain has a permanent offset from the pre-coalescence 
Fig. 17. Gravitational waveform resulting from the coalescence of two equal-mass black holes
(reduced mass ratio $\eta = 0.25$) with total mass $M$ at distance $R$. Both $M$ and $R$ are
expressed in time units; the conversions to conventional units are $M \to MT_\odot$ and
$R \to R/c$. $\Theta$ and $\Phi$ are assumed source directions. The dashed line is the predicted
waveform if the gravitational memory effect is ignored (Ref. 44).
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value. The amplitude of the memory effect is 


$$ h_{\rm mem} = \frac{\epsilon\,\eta\,MT_\odot c}{D} \approx 10^{-15}\,\frac{M_9}{D_{\rm Gpc}}, \qquad (33) $$


where $\epsilon \approx 0.07$ is the mass fraction contributing to the memory effect and $\eta$ is the
reduced mass fraction, 0.25 for equal-mass binary components (see, e.g. Ref. 44).

This step change in $h$ produces a step change in the observed pulse frequency
$\Delta\nu/\nu = h_{\rm mem}$, i.e. a “glitch”. This glitch persists until it is reversed by the
pulsar term at a time $(d/c)(1+\cos\theta)$ later. In a PTA, these reversals will occur at different times
for different pulsars. Of course, it is also possible that a GW memory jump could 
be detected in the pulsar terms, but there it may be confused with a real glitch in 
the intrinsic pulse frequency, whereas in the Earth term there is a correlation in the 
effect on different pulsars. However, glitches in MSPs are rare (only one very small 
glitch detected so far: Ref. 29). Also, real pulsar glitches are generally spin-ups, 
whereas a GW-memory jump may be of either sign, depending on pulsar-source 
angle. Therefore, as Cordes and Jenet®° have discussed, the pulsar terms may give 
an improved probability of detection. 
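
The memory amplitude and the size of the resulting timing signature can be estimated with a few lines of code. The sketch below assumes the form $h_{\rm mem} = \epsilon\eta MT_\odot c/D$ with the values $\epsilon = 0.07$ and $\eta = 0.25$ given in the text; the function names are hypothetical, and the linear residual drift $h_{\rm mem}\,t$ is the pre-fit effect of a fractional frequency step, before any refitting of the pulsar spin frequency.

```python
# Minimal sketch: GW memory strain and the timing-residual drift produced
# by the associated frequency step. Names and constants are illustrative.

T_SUN = 4.925491e-6                  # G*M_sun/c^3 in seconds
C = 2.99792458e8                     # speed of light in m/s
GPC = 3.0856776e25                   # one gigaparsec in metres
SECONDS_PER_YEAR = 365.25 * 86400.0


def memory_strain(total_mass_msun, distance_gpc, epsilon=0.07, eta=0.25):
    """Memory amplitude eps * eta * M * T_sun * c / D."""
    return epsilon * eta * C * total_mass_msun * T_SUN / (distance_gpc * GPC)


def residual_drift_seconds(h_mem, elapsed_years):
    """Pre-fit residual built up by a frequency step of fractional size h_mem."""
    return h_mem * elapsed_years * SECONDS_PER_YEAR


if __name__ == "__main__":
    h = memory_strain(1e9, 1.0)
    print(f"h_mem = {h:.2e}")
    print(f"drift after 1 yr = {residual_drift_seconds(h, 1.0) * 1e9:.0f} ns")
```

For an equal-mass system of $10^9$ solar masses at 1 Gpc, this gives $h_{\rm mem}$ just under $10^{-15}$ and a drift of a few tens of nanoseconds per year.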

Black-hole binary systems with circular orbits emit GW at the second harmonic
of the orbital frequency, i.e. $f = 2f_b$. For eccentric orbits, the GW emission
becomes more burst-like as the accelerations and hence GW power are greatest 
around periastron when the two black holes are closest together.'°? In the spectral 
domain, power spreads to higher harmonics and also to the fundamental frequency 
$f_b$. In the gravitational-decay phase of evolution, when energy loss is dominated
by GW emission, the orbit tends to circularise.? However, at earlier phases of the 
orbital decay when three-body stellar interactions or interaction with a gaseous 
disk surrounding the binary system are important, the eccentricity may grow. Stel- 
lar three-body interactions can result in orbital decay through dynamical friction, 
but probably result in a modest increase in the eccentricity of the black-hole binary 
system (e.g. Ref. 90). However, Roedig et al.1'° find that initially mildly eccentric 
binary systems decaying through interaction with a gaseous disk evolve toward a 
limiting eccentricity in the 0.6-0.8 range. In a regime of frequent mergers it is even 
possible that a black-hole triplet could form* and, in this case, eccentricities as high 
as 0.99 could exist. 

Finn and Lommen (Ref. 47) have investigated the GW emission and the resulting timing
residuals from a close parabolic encounter of two massive black holes. As Fig. 18
shows, an encounter of two $10^9\,M_\odot$ black holes located at a distance of 15 Mpc
with a minimum separation of 0.02 pc produces a GW burst of duration about 1 
year and maximum GW strain ~ 1071!%. This results in potentially detectable PTA 
timing residuals of about 1 μs amplitude with the same timescale. Unfortunately,
the probability of having such a close encounter of two SMBHs in the local universe 
within PTA observational data spans is not high. 

Cosmic strings are another potential source of GW burst emission. They can 
radiate over a wide frequency range with a huge range of possible amplitudes and 
Fig. 18. The upper plot shows the gravitational waveforms in the $h_+$ and $h_\times$ polarizations
resulting from a parabolic encounter with an impact parameter of 0.02 pc of two $10^9\,M_\odot$ black
holes located at a distance of 15 Mpc in the direction of the Virgo Cluster. The lower plot shows
the resulting timing residuals for several PTA pulsars (Ref. 47).


timescales (Fig. 16) depending on the detailed mechanism invoked (loops, cusps, 
short strings, etc.) and the very wide (virtually unlimited) parameter space." 127 
Short bursts radiating in the LIGO and eLISA bands may be frequent and 
unresolved, producing a GWB at these frequencies. However, bursts with longer 
timescales, producing radiation in the nanohertz band, are also possible but are 
likely to be extremely rare.°° Consequently, while in principle such bursts could be 
detected, in practice it is unlikely that PTAs will be able to significantly constrain 
models for GW burst emission from cosmic strings. 


3.3. Pulsar timing arrays and current results 


In this section, we first describe the three main PTAs currently operating worldwide:
the European Pulsar Timing Array (EPTA), the North American pulsar
timing array (NANOGrav) and the Parkes Pulsar Timing Array (PPTA), and the
collaboration between them, the International Pulsar Timing Array (IPTA). PTAs
have many possible applications such as establishing a pulsar-based timescale,°” 
investigating the accuracy of solar-system ephemerides,?° and investigating the 
properties of the pulsars themselves (e.g. Refs. 156 and 118) and of the inter- 
vening interstellar medium (e.g. Ref. 67). However, here we concentrate on what 
is undoubtedly their primary scientific goal, the direct detection of gravitational 
waves. Unfortunately, in common with other GW detection efforts around the world, 
PTAs have so far only been able to place limits on the strength of signals from poten- 
tial GW sources. However, these limits are now beginning to seriously constrain the 
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astrophysical source models and the assumptions that go into them and hence have 
implications that go far beyond the GW studies themselves. 


3.3.1. Existing PTAs 


The EPTA uses five large radio telescopes in Europe: the Effelsberg 100 m telescope
in Germany, the Nançay Radio Telescope in France (equivalent in area to a 95 m dish),
the Westerbork Synthesis Radio Telescope in the Netherlands (similar effective
area to the Nançay telescope), the 76 m Lovell Telescope at Jodrell Bank in England
and the recently completed 64 m Sardinia Radio Telescope in Italy, to observe
about 40 MSPs with a cadence of between a few days and 30 days for different 
pulsars.”? Different telescopes observe at different frequencies in the range 0.3- 
2.6 GHz, but all are instrumented at 1.4GHz. Normally the five telescopes observe 
independently, but in a project known as the “Large European Array for Pulsars” 
(LEAP), 1.4 GHz signals over a bandwidth of 128 MHz from the five telescopes can
be summed coherently to form the equivalent of a 194 m diameter radio telescope. The
different telescopes use different signal-processing systems, either digital filterbanks
or coherent dedispersion systems or both.
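
The 194 m figure can be checked by adding collecting areas, since a coherent sum of telescopes behaves like a single dish whose area is the sum of the individual areas. The sketch below assumes equal aperture efficiency for all five telescopes and takes Westerbork as a 95 m equivalent dish on the strength of the statement that its effective area is similar to that of Nançay; both are simplifying assumptions, and the names are illustrative.

```python
# Minimal sketch: equivalent single-dish diameter of a coherently summed
# array, assuming equal aperture efficiencies. The Westerbork value is an
# assumption based on "similar effective area" to Nancay.
import math

LEAP_DIAMETERS_M = {
    "Effelsberg": 100.0,
    "Nancay": 95.0,
    "Westerbork": 95.0,   # assumed equal to the Nancay equivalent diameter
    "Lovell": 76.0,
    "Sardinia": 64.0,
}


def equivalent_diameter_m(diameters_m):
    """Diameter of one dish with the same total area: sqrt(sum of d^2)."""
    return math.sqrt(sum(d ** 2 for d in diameters_m))


if __name__ == "__main__":
    d_eq = equivalent_diameter_m(LEAP_DIAMETERS_M.values())
    print(f"LEAP equivalent diameter = {d_eq:.1f} m")
```

With these inputs the result is a little under 195 m, in line with the quoted 194 m.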

NANOGrav makes use of the 300m Arecibo radio telescope in Puerto Rico and 
the 100m Green Bank Telescope (GBT) in West Virginia.8’ A sample of about 36 
pulsars is observed, typically at 3 week intervals. At Arecibo, observations are made 
in bands centered at 430 MHz and 1410MHz, whereas at the GBT, the observed 
bands are centered at 820 MHz and 1500 MHz. Currently both radio telescopes use 
coherent dedispersion systems with bandwidths up to 800 MHz, but in the past a 
range of filterbank and coherent dedispersion systems with more limited bandwidths 
have been used. 

As the name suggests, the PPTA uses the Parkes 64m radio telescope located 
in New South Wales, Australia. A sample of 22 pulsars is currently being observed 
with regular observations at 2-3 week intervals in three bands around 730 MHz, 
1400 MHz and 3100 MHz respectively.°° °° Coherent dedispersion systems are used 
at 730MHz and 1400MHz with bandwidths up to 310MHz and digital filter- 
banks at 1400MHz and 3100MHz with bandwidths of 256 MHz and 1024 MHz, 
respectively. 

Data sets from all three PTAs have spans ranging from a few years up to about 
20 years for different pulsars; three (including the original MSP, PSR B1937+21) 
have Arecibo data spans of nearly 30 years. The three PTAs together observe about 
50 pulsars with some being observed by two or even all three of the PTAs. Their 
distribution on the sky is shown in Fig. 19. 

Given that the combined data set of the three PTAs contains a larger number 
of pulsars, improved observation cadence and greater frequency diversity than the 
data set of any one PTA, there is a strong motivation to combine all the available 
data sets to obtain maximum sensitivity for PTA scientific objectives. The IPTA
consortium was set up to facilitate progress toward this goal (Ref. 85). The IPTA also
arranges annual science meetings and student workshops and provides a forum for
outreach programs and other activities related to PTA research.

Fig. 19. Distribution on the sky of MSPs being timed by the three PTAs, with different symbols
for each PTA. Right ascension increases to the left with 0 at the plot center. The symbol size is
related to the ratio $S_{1400}/P$, where $S_{1400}$ and $P$ are the pulsar 1400 MHz flux density and
pulse period respectively (Ref. 85).


3.3.2. Limits on the nanohertz GW background 


As discussed in Sec. 3.2.1, GWs from a cosmological distribution of SMBH binary 
systems are expected to contribute a very “red” signal to the spectrum of pulsar 
timing residuals. The expected signal from other GWB sources is similar. Con- 
sequently, long-term observations of a single pulsar with little or no detectable 
intrinsic timing irregularities can be used to place a limit on the strength of the 
GWB in the Galaxy. Of course, statistical limits can be improved by using data 
from several such pulsars. An early limit on the GWB at a frequency of 4.5nHz 
was set by Kaspi et al.® using Arecibo observations of two MSPs, PSR B1855+09 
and PSR B1937+21, with a 95% confidence limit on $\Omega_{\rm gw}h^2$ of $6 \times 10^{-8}$, where
$h = H_0/(100\ {\rm km\,s^{-1}\,Mpc^{-1}})$.

Since the advent of the various PTA projects, both the quality and quantity
of timing data sets have improved and a variety of analysis techniques have been
employed to extract increasingly restrictive limits. Based on early PPTA data on
seven pulsars combined with the Kaspi et al. PSR B1855+09 data set, Jenet et al. (Ref. 64)
used a “frequentist” approach with a statistic based on the amplitude of the
low-frequency components in the power spectrum of the timing residuals to set a 95%
confidence limit of about $2 \times 10^{-8}$ on $\Omega_{\rm gw}$ at a GW frequency of 1/(8 yr) or 4 nHz.
From Eqs. (25) and (30), this result is equivalent to a characteristic strain at
frequency 1/(1 yr) of $A_{1\rm yr} \approx 1.1 \times 10^{-14}$.
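
The conversion between an energy-density limit and a strain amplitude can be sketched numerically. The code below assumes the standard relations $h_c(f) = A_{1\rm yr}(f/f_{1\rm yr})^{\alpha}$ and $\Omega_{\rm gw}(f) = (2\pi^2/3H_0^2)\,f^2 h_c^2(f)$, which are taken here to be what Eqs. (25) and (30) express; the function name, the default $\alpha = -2/3$ and the choice of Hubble constant are assumptions of this example.

```python
# Minimal sketch: convert a GWB amplitude A_1yr into Omega_gw at frequency f,
# using h_c = A (f/f_1yr)^alpha and Omega = (2 pi^2 / 3 H0^2) f^2 h_c^2.
# These standard relations are assumed to match Eqs. (25) and (30).
import math

SECONDS_PER_YEAR = 365.25 * 86400.0
KM_PER_MPC = 3.0856776e19


def omega_gw(a_1yr, f_hz, alpha=-2.0 / 3.0, h0_km_s_mpc=100.0):
    """Energy density of the GWB relative to critical density at f_hz."""
    f_1yr = 1.0 / SECONDS_PER_YEAR
    h_c = a_1yr * (f_hz / f_1yr) ** alpha
    hubble_per_s = h0_km_s_mpc / KM_PER_MPC
    return 2.0 * math.pi ** 2 / 3.0 * f_hz ** 2 * h_c ** 2 / hubble_per_s ** 2


if __name__ == "__main__":
    # With H0 = 100 km/s/Mpc the result is Omega_gw * h^2.
    print(f"Omega_gw h^2 = {omega_gw(1.1e-14, 4e-9):.2e}")
```

With $A_{1\rm yr} = 1.1 \times 10^{-14}$ at 4 nHz this returns about $2 \times 10^{-8}$, reproducing the correspondence quoted above.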

van Haasteren et al. (Ref. 138) analyzed EPTA 1400 MHz data sets for five MSPs with
spans of 5-8 years using a Bayesian analysis to place limits on the GWB amplitude
as a function of its spectral index $\alpha$. For $\alpha = -2/3$, the derived limit at the 95%
confidence level is $A_{1\rm yr} \approx 6 \times 10^{-15}$, about a factor 1.8 better than the
Jenet et al. (Ref. 64) limit.

NANOGrav multi-band data sets recorded between 2005 and 2010 for 17 MSPs
were analyzed by Demorest et al. (Ref. 40). Timing analyses taking into account
time-varying dispersion delays and frequency-dependent pulse profiles were carried out
to form sets of post-fit residuals and the corresponding covariance matrices for each
pulsar. For most MSPs in the sample, no red noise signal was detectable in the
post-fit residuals. Considering just the pulsar with the smallest post-fit residuals, PSR
J1713+0747, and taking into account absorption of red noise by the timing fit,
Demorest et al. obtained a 95% confidence limit for $A_{1\rm yr}$ of $1.1 \times 10^{-14}$. A separate
cross-correlation analysis weighted by the expected Hellings and Downs function
[Eq. (22)] across all the pulsars in the sample resulted in a somewhat better limit
$A_{1\rm yr} \approx 7 \times 10^{-15}$, although this limit was dominated by correlations with the two
best pulsars in the sample, PSRs J1713+0747 and J1909−3744.

Based on PPTA and earlier Parkes timing observations made in three observing 
bands centered near 700 MHz, 1400 MHz and 3100 MHz respectively with data spans 
of up to 17 years,°° together with the PSR B1855+09 archival Arecibo data,®° 
Shannon et al. (Ref. 119) placed a limit of $1.3 \times 10^{-9}$ on $\Omega_{\rm gw}$ at a GW frequency of
2.8 nHz. The corresponding limit on $A_{1\rm yr}$, assuming a GW spectral index of $-2/3$,
is $2.4 \times 10^{-15}$. This analysis, which included dispersion correction for the PPTA
data sets and was based on the six best PPTA pulsars, used a statistical method
similar to that of Jenet et al. (Ref. 64) but included modeling of the red noise in the timing
residuals. As shown in Fig. 20, this limit rules out a model in which the growth of
SMBH in galaxies is dominated by mergers (Ref. 89) at the 91% confidence level, but is
consistent with other models for galaxy and SMBH evolution where much of the 
SMBH growth is by accretion. 

As discussed in Sec. 3.2.2, topological defects in the early universe are another 
potential source for the GWB. Figure 21 shows limits on the dimensionless string
tension $G\mu/c^2$ as a function of loop size for various sets of other relevant
parameters (Ref. 111). The middle solid curves are limits based on the current EPTA data sets
and the lower dashed curves are projections for LEAP data sets that coherently
combine data from the EPTA telescopes. The upper dot-dashed line is a limit based
on LIGO data (Ref. 1). The current EPTA results give a conservative upper limit on the
string tension of $5.3 \times 10^{-7}$. Lower limits can be obtained with more restricted
assumptions about the string parameters (e.g. Refs. 37, 64 and 138). 


3.3.3. Limits on GW emission from individual black-hole binary systems 

For an isolated binary system at a luminosity distance $d_L$, the intrinsic GW strain
amplitude is given by

$$ h_0 = 2\,\frac{(GM_c)^{5/3}(\pi f)^{2/3}}{c^4 d_L}, \qquad (34) $$
Fig. 20. Limits on the relative energy density of the GWB, $\Omega_{\rm gw}$, at a GW frequency of 2.8 nHz
based on the PPTA data sets, together with predictions for $\Omega_{\rm gw}$ based on several different models
for the GWB (Ref. 119). The solid and dashed lines that are asymptotic to 1.0 at low $\Omega_{\rm gw}$ show the
probability Pr that a GWB signal of energy density $\Omega_{\rm gw}$ can exist in the PPTA data sets,
based on Gaussian and non-Gaussian GWB statistics respectively. The shaded region is ruled
out with 95% confidence by the PPTA data. Corresponding limits from analysis of EPTA (Ref. 138)
and NANOGrav (Ref. 40) data sets, scaled to $f_{\rm gw} = 2.8$ nHz, are also shown. The Gaussian curves
show the probability density functions $\rho_M$ for the existence of a GWB with energy density $\Omega_{\rm gw}$
based on a merger-driven model for growth of SMBHs in galaxies (Ref. 89), an empirical synthesis of
observational constraints on SMBHs in galaxies,!!% and based on the Millennium dark matter 
simulations!® together with semi-analytic models for growth of SMBHs in galaxies (see Ref. 119 
for more details). (For color version, see page I-CP7.) 


where $M_c$ is the binary chirp mass and $f = 2f_b$ is the GW frequency. The actual
observed signal depends on the orbital orientation and phase as well as the GW
polarization angle. By averaging over these quantities, PTAs can set probabilistic
limits for the strain amplitude as a function of $f$, both in a given direction and
averaged over all directions (see, e.g. Refs. 9 and 158). In these analyses, there is
assumed to be negligible evolution of $f$ over the data span and only the Earth term
[Eq. (20)] is considered. Because of uncertainties in the pulsar distances, the pulsar
terms cannot be added coherently and just contribute noise.
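
The strain amplitude of a circular binary is straightforward to evaluate once masses and distances are expressed in time units. The following sketch assumes the conventional form $h_0 = 2(GM_c)^{5/3}(\pi f)^{2/3}/(c^4 d_L)$, including the leading factor of 2 used in the standard definition; the function name and the rounded constants are assumptions made for this example.

```python
# Minimal sketch: intrinsic strain amplitude of a circular binary,
# h0 = 2 (G M_c)^(5/3) (pi f)^(2/3) / (c^4 d_L), evaluated in time units.
# The factor of 2 follows the conventional definition (an assumption here).
import math

T_SUN = 4.925491e-6    # G*M_sun/c^3 in seconds
C = 2.99792458e8       # speed of light in m/s
MPC = 3.0856776e22     # one megaparsec in metres


def strain_amplitude(chirp_mass_msun, gw_frequency_hz, distance_mpc):
    """Return h0 for a chirp mass (solar units), GW frequency and distance."""
    mass_time_s = chirp_mass_msun * T_SUN        # G M_c / c^3
    distance_time_s = distance_mpc * MPC / C     # d_L / c
    return (2.0 * mass_time_s ** (5.0 / 3.0)
            * (math.pi * gw_frequency_hz) ** (2.0 / 3.0) / distance_time_s)


if __name__ == "__main__":
    print(f"h0(1e9 Msun, 30 Mpc)   = {strain_amplitude(1e9, 1e-8, 30.0):.2e}")
    print(f"h0(1e10 Msun, 400 Mpc) = {strain_amplitude(1e10, 1e-8, 400.0):.2e}")
```

At a GW frequency of $10^{-8}$ Hz, a $10^9$ solar-mass system at 30 Mpc gives $h_0$ of roughly $10^{-14}$, and a $10^{10}$ solar-mass system at 400 Mpc gives a few times $10^{-14}$.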

Figure 22 shows both sky-averaged upper limits and detection sensitivity for
continuous-wave GW signals as a function of GW frequency based on the PPTA
data set and using a frequentist analysis method (Ref. 158). The best limits and sensitivity
are obtained for GW frequencies around $10^{-8}$ Hz, where the upper limit on $h_0$
is about $1.1 \times 10^{-14}$. A similar analysis of the five-year NANOGrav data set by
Arzoumanian et al. (Ref. 9) using both frequentist and Bayesian analysis methods gave a
Fig. 21. Limits on cosmic string tension as a function of the loop size scale parameter $\alpha$ for
different cutoff node numbers $n_*$ and intrinsic spectral slopes $q$ for the current EPTA limit on
the energy density of the GWB (thick lines) and the projected sensitivity of the LEAP PTA
(corresponding thin lines). The line with one long dash and three short dashes is an analytic
approximation which is valid for large loops. The uppermost line is the LIGO limit at $f = 1$ kHz
(Refs. 111 and 1).


somewhat higher sky-averaged upper limit of about $5 \times 10^{-14}$ at $10^{-8}$ Hz. Because
of the uneven sky distribution of PTA pulsars, there is quite a strong dependence 
of sensitivity on source direction. This is illustrated in Fig. 23 which shows that 
sensitivity is greater toward the greatest concentration of PTA pulsars, roughly 
toward the Galactic Center. 

Figure 22 also shows that we can effectively rule out the existence of SMBH
binary systems with $M_c = 10^9\,M_\odot$ and orbital frequencies around $10^{-8}$ Hz at
distances closer than 30 Mpc. Similarly, a system with $M_c = 10^{10}\,M_\odot$ at a distance of
400 Mpc should be detectable. Unfortunately, as Fig. 23 shows, the nearby galaxy
clusters such as Virgo, Coma and Fornax are all in regions of relatively low
sensitivity for the PPTA (and other PTAs), so the effective limits for these clusters
are a factor of a few higher. It is unlikely that such massive binary systems exist
in these clusters. More generally, limits can also be placed on the SMBH binary
coalescence rate in the nearby universe ($z \lesssim 0.1$). Based on the PPTA results, Zhu
et al. (Ref. 158) place a 95% confidence limit of
$4 \times 10^{-3}\,(10^{10}\,M_\odot/M_c)^{10/3}$ Mpc$^{-3}$ Gyr$^{-1}$
on the coalescence rate. This limit is about two orders of magnitude above current
estimates of the galaxy merger rate in the local universe (Ref. 9).


Fig. 22. Sky-averaged limits on the intrinsic GW strain amplitude $h_0$ as a function of GW
frequency $f$ based on the PPTA data sets. The lower curves represent the largest GW signal
(with a false-alarm probability of 1%) that could be present in the real PPTA data (dashed line)
and a simulated data set (solid line). The upper solid line gives the sensitivity of the PPTA to
a continuous-wave source, i.e. the minimum signal that could be detected with 95% probability.
The upper limits and sensitivities are higher at frequencies of 1/(1 yr) and 1/(6 months) as these
frequencies are absorbed by the timing fits for position and parallax respectively. The sloping
dot-dashed lines are the expected signal levels for SMBH binary systems with $M_c = 10^{10}\,M_\odot$
and distance 400 Mpc (upper line) and $M_c = 10^9\,M_\odot$ and distance 30 Mpc (lower line) (Ref. 158).
Fig. 23. Sky distribution of the luminosity distance $d_L$ to which a binary system with chirp
mass $M_c = 10^9\,M_\odot$ radiating at $10^{-8}$ Hz could be detected. The stars indicate the positions of
the 20 PPTA pulsars and the diamonds are potential sources of GW continuous-wave emission
(Ref. 158).
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These rate estimates are based on the orientation-averaged and sky-averaged 
amplitudes. It is of course possible that a favorably oriented and located SMBH 
binary system in the late stages of coalescence could exist. On the other hand, the 
estimates are based on circular binary orbits and, as discussed in Secs. 3.2.1 and 
3.2.3, short-period SMBH binaries may have significant eccentricity which reduces 
the GW power at the fundamental frequency $f = 2f_b$ and hence the detectability of
such systems. On balance, it seems unlikely that GW from an individual coalescing 
SMBH binary system will be detected with the current generation of PTAs. 


3.4. Future prospects 


PTAs have now achieved data spans and ToA precisions that would allow detection 
of the GWB predicted by some models for the evolution of galaxies and the SMBHs 
at their core (see, e.g. Ref. 119). Up to now, no detections have been made. While 
this is disappointing from the point of view of GW astrophysics, it is starting to 
have important implications for galaxy and SMBH evolution models and to rule 
out some scenarios. It also implies that PTAs are close to detecting the GWB if 
current predictions for its amplitude are correct. 

Siemens et al. (Ref. 126) have considered the sensitivity of an idealized PTA to a GWB.
At low signal levels, when the lowest signal frequencies are below the white noise
level, the detection signal-to-noise ratio (S/N) is

$$ \langle\rho\rangle \propto M c\,\frac{A_{1\rm yr}^2}{\sigma^2}\,T^{\beta}, \qquad (35) $$

where $M$ is the number of pulsars in the array, $c$ is the observing cadence (frequency
of observations), $A_{1\rm yr}$ is the GWB amplitude [Eq. (30)], $\sigma$ is the rms level of the
white timing noise, $T$ is the observing data span and $\beta$ is the inverse spectral index
of the GWB signal in the timing residuals, taken to be 13/3 [Eq. (31)]. In the
detection regime where the GWB signal exceeds the white noise level, the S/N is

$$ \langle\rho\rangle \propto M \left(\frac{c\,A_{1\rm yr}^2}{\sigma^2}\right)^{1/(2\beta)} T^{1/2}. \qquad (36) $$


Consequently, in the pre-detection regime, the S/N increases rapidly with increased 
observing cadence and data span and decreased timing noise, but has a much weaker 
dependence on these parameters in the strong signal regime. The reason for this is 
that noise from the uncorrelated pulsar term, which is also proportional to Ajyr, 
dominates over the white “receiver” noise, greatly modifying the statistical behav- 
ior. Importantly though, in both regimes, the S/N is proportional to M, the number 
of pulsars in the PTA. Figure 24 illustrates these dependencies for a range of plau- 
sible future PTAs. 
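
The contrast between the two regimes is easiest to see by asking how much the S/N improves when one parameter is changed. The sketch below assumes the scalings $\langle\rho\rangle \propto M c\,\sigma^{-2} T^{\beta}$ for weak signals and $\langle\rho\rangle \propto M (c/\sigma^2)^{1/(2\beta)} T^{1/2}$ for strong signals, with $\beta = 13/3$ and the GWB amplitude held fixed; the function names and the "improvement factor" parametrization are assumptions of this illustration.

```python
# Minimal sketch: relative S/N gain in the weak- and strong-signal regimes
# for fixed GWB amplitude, assuming the scalings stated in the lead-in.
# Each argument is the factor by which that PTA parameter is multiplied.

BETA = 13.0 / 3.0


def weak_signal_gain(pulsars=1.0, cadence=1.0, span=1.0, noise=1.0):
    """S/N gain when S/N scales as M * c * T^beta / sigma^2."""
    return pulsars * cadence * span ** BETA / noise ** 2


def strong_signal_gain(pulsars=1.0, cadence=1.0, span=1.0, noise=1.0):
    """S/N gain when S/N scales as M * (c / sigma^2)^(1/(2 beta)) * T^(1/2)."""
    return pulsars * (cadence / noise ** 2) ** (1.0 / (2.0 * BETA)) * span ** 0.5


if __name__ == "__main__":
    for label, kwargs in [("double span", {"span": 2.0}),
                          ("double cadence", {"cadence": 2.0}),
                          ("double pulsars", {"pulsars": 2.0})]:
        print(f"{label:15s} weak x{weak_signal_gain(**kwargs):6.2f}  "
              f"strong x{strong_signal_gain(**kwargs):5.2f}")
```

Doubling the data span raises the weak-signal S/N by a factor of about 20 but the strong-signal S/N by only about 1.4, whereas doubling the number of pulsars doubles the S/N in both regimes.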

As Siemens et al. (Ref. 126) point out, if $A_{1\rm yr} \approx 10^{-15}$, current PTAs are already in
the “strong signal” regime. This means that increasing the observing data spans 
and cadence or decreasing ToA uncertainties has limited effect on the S/N of a 
potential detection. Increasing the number of pulsars in the PTA is a much more 
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Fig. 24. Detection S/N for a GWB as a function of PTA data span for four different, but plausible, 
future PTAs. See text for the meaning of the PTA parameters (Ref. 126). 


cost-effective way to increase detection sensitivity. This fact provides much of the 
motivation to combine data from existing PTAs to form the IPTA. In the future,
the Chinese FAST radio telescope®4 and the SKA?‘ will provide a large increase 
in radiometer sensitivity compared to existing instruments. The discussion above 
shows that this increased sensitivity will be best employed in increasing the number 
of (weaker) pulsars that are timed, rather than improving the ToA precision on the 
stronger existing PTA pulsars. Considerations of “jitter noise” in ToAs resulting 
from shape variations in individual pulses!!® lead to the same conclusion. 

While direct detection of the GWB would be enormously exciting and signifi- 
cant, there is no doubt that direct detection of GW from individual SMBH binary 
systems is potentially much more interesting from an astrophysical perspective. It 
opens up the possibility of identifying a GW source with a source or region identified
through electromagnetic-wave (radio, optical, X-ray or γ-ray) emission and the
advent of “multi-messenger” astronomy (e.g. Refs. 78, 114 and 22). For example, 
many active galactic nuclei (AGNs) show evidence of a close binary SMBH at their 
core. Examples of this include X-shaped radio lobes (e.g. Ref. 157), double-peaked 
or variable emission lines from the core region (e.g. Ref. 124), quasi-periodic mod- 
ulation of core radio (e.g. Ref. 137) or X-ray emission (e.g. Refs. 114 and 79), and 
direct imaging of double AGN (e.g. Ref. 109). 

With one exception, existing PTA systems do not have sufficient sensitivity to 
detect these potential GW sources. The exception is the claimed SMBH binary 
system identified by VLBI astrometry of the nearby quasar 3C 66B!°? which was 
effectively ruled out by pulsar timing observations.® Future PTAs including FAST 
and the SKA will have much increased sensitivity, making searches for other GW 
candidate sources potentially more productive. 
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Similarly, although blind searches for continuous-wave GW signals in current 
PTA data sets have a low probability of successful detection, in future this should 
not be the case. The possibility of identification of a source galaxy or AGN then 
depends critically on the accuracy of the position determination for the GW source. 
When only detection of the “Earth term” is considered, the accuracy is at best 
many tens of square degrees (e.g. Ref. 158) containing thousands if not millions of 
galaxies. Only a correlation of the GW signal with a modulation of some property 
of a galaxy core (e.g. intensity or velocity) would establish an identification. 

The situation changes dramatically if the pulsar terms can be added coherently 
with the Earth term [Eq. (20)]. Each Earth–pulsar system then forms an interferometer
with baseline $d$ and fringe spacing $\sim \lambda_{\rm gw}/[2\pi d(1 + \gamma)]$, where $d$ is the
pulsar distance, $\lambda_{\rm gw}$ is the GW wavelength and $\gamma$ is the direction cosine between
the pulsar direction and the GW propagation direction.!9'"* Positional accuracies 
are roughly the fringe spacing divided by the S/N of the GW detection. Since d is 
typically $> 10^3$ light-years and $\lambda_{\rm gw}$ is a few light-years, sub-arc-minute positional
accuracies are possible for the stronger sources. However, to achieve the coherent
summation, the distance to the pulsars must be known to better than $\lambda_{\rm gw}/(1+\gamma)$,
i.e. about 1 pc unless the GW source is nearly aligned with the pulsar. Currently,
only one PTA pulsar, PSR J0437—4715, has a distance known to this accuracy, mea- 
sured through VLBI astrometry,? but in the SKA era this will change. As Boyle 
and Pen!® pointed out, with a high density of PTA pulsars on the sky, advantage 
can be taken of the $(1+\gamma)$ factor and so pulsars with less precisely determined 
distances located in the general direction of the GW source could be useful. In this 
situation, PTAs would not be confusion-limited and, in principle, many individual 
GW sources could be identified. With a high enough number of PTA pulsars (say 
1000 or more) it may be possible to localize a binary GW source by the quadrupo- 
lar pattern of timing residuals in pulsars surrounding the source, even if the pulsar 
distances are poorly determined. 
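
The scaling quoted above (fringe spacing divided by the detection S/N) can be illustrated with a minimal Python sketch. The function names and the input values (a 3 light-year GW wavelength, a pulsar at 3000 light-years, direction cosine 0, S/N of 10) are hypothetical choices made only for illustration; they are not taken from the text.

```python
import math

ARCMIN_PER_RAD = 180.0 * 60.0 / math.pi  # arc-minutes in one radian


def fringe_spacing_rad(lambda_gw_ly, d_ly, gamma):
    """Fringe spacing ~ lambda_GW / [2 pi d (1 + gamma)], in radians."""
    return lambda_gw_ly / (2.0 * math.pi * d_ly * (1.0 + gamma))


def position_accuracy_arcmin(lambda_gw_ly, d_ly, gamma, snr):
    """Rough positional accuracy: fringe spacing divided by the detection S/N."""
    return fringe_spacing_rad(lambda_gw_ly, d_ly, gamma) / snr * ARCMIN_PER_RAD


def distance_tolerance_ly(lambda_gw_ly, gamma):
    """Pulsar distance must be known to better than lambda_GW / (1 + gamma)."""
    return lambda_gw_ly / (1.0 + gamma)


# Hypothetical illustrative inputs (assumptions, not values from the text).
spacing = fringe_spacing_rad(3.0, 3000.0, 0.0)
accuracy = position_accuracy_arcmin(3.0, 3000.0, 0.0, snr=10.0)
tolerance = distance_tolerance_ly(3.0, 0.0)
print(spacing, accuracy, tolerance)
```

With these assumed inputs the accuracy comes out well below one arc-minute, and the required distance tolerance is of the order of the GW wavelength, i.e. about a parsec.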


4. Summary and Conclusion 


Nature has been very kind in providing us with a set of near-perfect celestial clocks, 
many in situations of rapidly varying gravitational accelerations. Not only are these 
celestial clocks, known as pulsars, precise time-keepers, they are also exceedingly 
compact. This enables them to be treated as point masses in theoretical analyses of 
their motion and also permits tests in the regime of strong gravitational fields. These 
qualities result in a very wide range of applications for pulsar time-keeping, most 
importantly, at least in the context of this review, to investigations of relativistic 
gravitation. 

Observations of double-neutron-star systems, wide circular pulsar—white-dwarf 
systems and even isolated pulsars have been used to test the accuracy of gravita- 
tional theories. Remarkably, GR is unscathed by all of these tests and hence remains 
the most viable theory of gravitation. Pulsar timing has provided the strongest 
available limits on at least six parameters describing deviations from GR. Continued 
and improved pulsar timing measurements, especially with new and highly 
sensitive radio telescopes such as FAST and the SKA, will both improve on these 
limits and enable new and different tests of relativistic gravity. They may even 
demonstrate a failure of GR to adequately account for the observations, leading to 
new or modified theories of gravitation. 

Continuing and new searches for previously unknown pulsars, especially with 
FAST and the SKA, will not only increase the number of pulsars that can be used in 
tests of relativistic gravity. They will also turn up new and exciting classes of object 
such as the recently discovered triple system, PSR J0337+1715. Such discoveries 
enrich the investigations of relativistic astrophysics that can be undertaken with 
pulsars. 

One of the outstanding goals of current astronomy and astrophysics is the direct 
detection and study of GWs. PTAs provide a viable mechanism for detection of 
GWs with frequencies in the nanohertz range. They therefore complement other 
existing or planned instruments such as the laser-interferometer systems LIGO and 
eLISA which are sensitive to GWs at frequencies of around 100 Hz and millihertz 
respectively. The most probable sources for GW detection by PTAs are binary 
super-massive black holes in the cores of distant galaxies. These produce an unre- 
solved background of GWs that is potentially detectable, but there may also be 
individual binary systems that could be detected by PTAs. 

There are currently three major PTAs operating, one each in Europe (EPTA), 
North America (NANOGrav) and Australia (PPTA). Up to now, no GWs have 
been detected by PTAs (or other GW detection systems) so the direct detection 
of GWs remains a goal. However, recent limits on the GW background are placing 
significant constraints on existing models for galaxy mergers over cosmological time 
and the formation and evolution of super-massive black holes in the cores of these 
galaxies. For example, a model in which black-hole growth is dominated by mergers 
is essentially ruled out. 

The sensitivity of PTAs to GWs is a function of several factors including the 
precision of the pulse arrival-time measurements, the data span of the PTA observa- 
tions and the cadence or frequency of observations within this data span. However 
the most important single factor is the number of pulsars in the PTA. Of course, 
these pulsars must meet certain timing-precision and period-stability criteria to use- 
fully contribute to a PTA. There are two main approaches to increasing the number 
of pulsars. First, existing data sets can be combined to form a single PTA — this 
is the goal of the IPTA project. Second, searches can be undertaken to increase the 
number of known pulsars suitable for PTA projects. With FAST and the SKA it 
is possible that hundreds of MSPs that are suitable for PTA projects will be both 
discovered and subsequently timed to high precision. This will surely lead to the 
detection of GWs and to detailed investigations of both the GWs themselves and 
the sources that generate them. 


After giving a brief introduction and presenting a complete classification of gravitational 
waves (GWs) according to their frequencies, we review and summarize the detection 
methods, the sensitivities and the sources. We notice that real-time detections are pos- 
sible above 300 pHz. Below 300 pHz, the detections are possible on GW imprints or 
indirectly. We are on the verge of detection. The progress in this field will be promising 
and thriving. We will see improvement of a few orders to several orders of magnitude in 
the GW detection sensitivities over all frequency bands in the next hundred years. 


Keywords: Gravitational waves (GWs); GW spectrum classification; GW sources; meth- 
ods of detecting GWs; GW sensitivities. 


1. Introduction and Classification 


Soon after the proposal of general relativity (GR), Einstein predicted the existence 
of gravitational waves (GWs) and estimated their strength from the wave equation 
he obtained in his 1916 paper on “Approximative Integration of the Field Equations 
of Gravitation”.1 Toward the end of his paper, he obtained the expression 
of the radiation $A$ of the system per unit time in GR as (Eq. (23) in his paper) 
$A = (\kappa/24\pi)\sum_{\alpha\beta}(\partial^3 J_{\alpha\beta}/\partial t^3)^2$, with $J_{\alpha\beta}$ defined as the time-variable components of the 
moment of inertia of the radiating system ($\kappa = 8\pi G_N$ in terms of the Newtonian gravitational 
constant $G_N$).^a He then continued that “This expression (for the radiation 
A) would get an additional factor $1/c^4$ if we would measure time in seconds and 
energy in Erg (erg). Considering $\kappa = 1.87\cdot 10^{-27}$ (in units of cm and gm), it is 
obvious that A has, in all imaginable cases, a practically vanishing value.” Indeed, 
at that time, there was a huge gap between the possible expected source strengths and the 
detection capability. However, with the great strides in astronomy and astrophysics 
and in the development of technology, this gap is largely bridged. White 
dwarfs were discovered in 1910, with their density soon estimated. Now we understand 
that GWs from white dwarf binaries in our Galaxy form a stochastic GW background 
(“confusion limit”)3 for space (low frequency) GW detection in GR. The 
first artificial satellite, Sputnik, was launched in 1957. However, at present the space 
GW missions are expected to be launched only about 19 years from now (~2034).4 

^a This radiation formula is corrected with the trace contribution of the moment of inertia subtracted 
and the overall factor replaced by $\kappa/80\pi$ [a factor 2 off compared with (37)] in Einstein’s next 
paper on GWs.2 With his correction, Einstein noted that “This result shows that a mechanical 
system which permanently retains spherical symmetry cannot radiate... .” 

The existence of GWs is a direct consequence of GR and an unavoidable consequence 
of all relativistic gravity theories with finite velocity of propagation. 
Maxwell’s electromagnetic theory predicted electromagnetic waves. Einstein’s GR 
and other relativistic gravity theories predict the existence of GWs. GWs propagate 
in spacetime, forming ripples of spacetime geometry. 

The role of GWs in gravity physics is like the role of electromagnetic waves in 
electromagnetic physics. The importance of GW detection is two-fold: (i) as a probe 
to explore fundamental physics and cosmology, especially black hole physics and 
early cosmology, and (ii) as a tool in astronomy and astrophysics to study compact 
objects and to count them, complementary to electromagnetic astronomy and cosmic 
ray (including neutrino) astronomy. 

The existence of gravitational radiation is demonstrated by binary pulsar orbit 
evolution.5,6 In GR, a binary star system emits energy in the form of GWs. 
The loss of energy results in the shrinkage of the orbit and the shortening of the orbital 
period. Based on more than 32 years (from 1974 through 2006) of timing observations 
of the relativistic binary pulsar B1913+16, the cumulative shift of periastron 
time is over 43 s. The orbital decay rate calculated in GR using parameters determined 
from pulsar timing observations agreed with the observed decay rate. From 
this and a relative acceleration correction due to solar system and pulsar system 
motion, Weisberg, Nice and Taylor6 concluded that the ratio of the measured orbital decay to 
the GR-predicted value from the emission of gravitational radiation is $0.997 \pm 0.002$, 
providing conclusive evidence for the existence of gravitational radiation, as in their 
previous papers. Kramer et al.7 did an orbit analysis of the double pulsar system 
PSR J0737-3039A/B from 2.5 years of pulse timing observations and found that the 
orbital period shortening rate 1.252(17) agreed with the GR prediction of 1.24787(13), 
the ratio being 1.003(14). Freire et al.8 analyzed about 10 years of timing data of the 
binary pulsar J1738+0333 and obtained an intrinsic orbital decay rate of 
$(-25.9 \pm 3.2)\times 10^{-15}$, which agreed well with the GR value $(-27.7^{+1.5}_{-1.9})\times 10^{-15}$ calculated 
using the determined orbital parameters. Further precision and many more systems 
are expected in the future for the observable imprint of GW radiation reaction on the 
orbital motion. 

The usual way of detecting GWs is by measuring the strain $\Delta l/l$ induced 
by them. Hence, GW detectors are usually amplitude sensors, not energy sensors. 
The detection of GWs can be resolved into characteristic frequencies. The conventional 
classification of GW frequency bands, as given by Thorne9 in 1995, was into 
(i) High-frequency band (1 Hz-10 kHz); (ii) Low-frequency band (100 μHz-1 Hz); 
(iii) Very-low-frequency band (1 nHz-100 nHz); (iv) Extremely-low-frequency band 
(1 aHz-1 fHz). This classification was mainly according to the frequency ranges of the 
corresponding types of detectors/detection methods: (i) ground GW detectors; (ii) space 
GW detectors; (iii) pulsar timing method and (iv) cosmic microwave background 
(CMB) methods. In 1997, we followed Ref. 9 and extended the band ranges to give 
the following classification10,11: 


(i) High-frequency band (1-10kHz). 

(ii) Low-frequency band (100 nHz-1 Hz). 
(iii) Very-low-frequency band (300 pHz-100 nHz). 
(iv) Extremely-low-frequency band (1 aHz-~10 fHz). 


Subsequently, we added the very-high-frequency band and the middle-frequency 
band because of enhanced interest and activity in these bands. Recently, we 
added the missing band (10 fHz-300 pHz) and the two bands beyond to give a 
complete frequency classification of GWs as compiled in Table 1.12-17 
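
As an illustration of how the band edges of Table 1 partition the spectrum, the following Python sketch maps a frequency to its band name. The function name `classify_gw_band` and the convention that each lower edge is inclusive are assumptions made here for the example; the table itself does not specify how boundary values are assigned.

```python
# Lower band edges in Hz, following Table 1.
# Assumption of this sketch: the lower edge of each band is inclusive.
BANDS = [
    (1e12, "Ultra-high frequency band"),
    (1e5, "Very-high-frequency band"),
    (10.0, "High-frequency band"),
    (0.1, "Middle frequency band"),
    (1e-7, "Low-frequency band"),
    (3e-10, "Very-low-frequency band"),
    (1e-14, "Ultra low-frequency band"),
    (1e-18, "Extremely-low (Hubble)-frequency band"),
]


def classify_gw_band(f_hz):
    """Return the Table 1 band name for a GW frequency given in Hz."""
    if f_hz <= 0.0:
        raise ValueError("frequency must be positive")
    for lower_edge, name in BANDS:
        if f_hz >= lower_edge:
            return name
    return "Beyond Hubble-frequency band"


for f in (100.0, 1e-3, 1e-8, 1e-16, 1e-20):
    print(f, classify_gw_band(f))
```

For example, 100 Hz falls in the high-frequency band targeted by ground-based interferometers, and 10 nHz falls in the very-low-frequency band targeted by PTAs.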

In Sec. 2, we give a brief introduction to GWs in GR. In Sec. 3, we review various 
methods of detection together with their typical/aimed sensitivities. In Sec. 4, we 
review various astrophysical and cosmological sources. In Sec. 5, we present an 
outlook. 

Table 1. Frequency classification of GWs.10-17 

Frequency band | Frequency range | Detection method 
Ultra-high frequency band | above 1 THz | Terahertz resonators, optical resonators and magnetic conversion detectors. 
Very-high-frequency band | 100 kHz-1 THz | Microwave resonator/wave guide detectors, laser interferometers and Gaussian beam detectors. 
High-frequency band (audio band)* | 10 Hz-100 kHz | Low-temperature resonators and ground-based laser-interferometric detectors. 
Middle frequency band | 0.1 Hz-10 Hz | Space laser-interferometric detectors of arm length 1,000 km-60,000 km. 
Low-frequency band (milli-Hz band)† | 100 nHz-0.1 Hz | Space laser-interferometric detectors of arm length longer than 60,000 km. 
Very-low-frequency band (nano-Hz band) | 300 pHz-100 nHz | Pulsar timing arrays (PTAs). 
Ultra low-frequency band | 10 fHz-300 pHz | Astrometry of quasar proper motions. 
Extremely-low (Hubble)-frequency band (cosmological band) | 1 aHz-10 fHz | CMB experiments. 
Beyond Hubble-frequency band | below 1 aHz | Through the verifications of inflationary/primordial cosmological models. 

Notes: *The range of audio band normally goes only to 10 kHz. 
†The range of milli-Hz band is 0.1 mHz-100 mHz. 


2. GWs in GR 

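The equation of motion of GR, i.e. the Einstein equation, is 

$$G_{\mu\nu} = \kappa T_{\mu\nu}, \tag{1}$$

where $T_{\mu\nu}$ is the stress-energy tensor and $\kappa = 8\pi G_N$. (We use the MTW18 conventions 
with signature −2; this is also the convention used in Ref. 19; Greek indices 
run from 0 to 3; Latin indices run from 1 to 3; the cosmological constant is negligible 
for treating the methods of GW detection and for evaluating GW sources at the 
aimed accuracy of this paper except in the Hubble frequency band and beyond the 
Hubble frequency band and will be neglected in this treatment except in association 
with cosmological models.) Contracting the equations of motion (1), we have 

$$R = -8\pi G_N T, \tag{2}$$

where $T \equiv T_\mu{}^\mu$. Substituting (2) into (1), we obtain the following equivalent 
equations of motion: 

$$R_{\mu\nu} = 8\pi G_N\left[T_{\mu\nu} - \tfrac{1}{2}g_{\mu\nu}T\right]. \tag{3}$$

For a weak field in quasi-Minkowskian coordinates, we express the metric $g_{\alpha\beta}$ as 

$$g_{\alpha\beta} = \eta_{\alpha\beta} + h_{\alpha\beta}, \qquad h_{\alpha\beta} \ll 1. \tag{4}$$

Since $h_{\alpha\beta}$ is a small quantity, we expand everything in $h_{\alpha\beta}$ and linearize the results 
to obtain the linear approximation. For linearized quantities, we use the Minkowski 
metric $\eta_{\alpha\beta}$ to raise and lower indices without affecting the linearized results. The 
Riemann curvature tensor can be expressed as 

$$R_{\alpha\beta\gamma\delta} = \tfrac{1}{2}\left(g_{\alpha\delta,\beta\gamma} + g_{\beta\gamma,\alpha\delta} - g_{\alpha\gamma,\beta\delta} - g_{\beta\delta,\alpha\gamma}\right) + g_{\mu\nu}\left(\Gamma^\mu{}_{\beta\gamma}\Gamma^\nu{}_{\alpha\delta} - \Gamma^\mu{}_{\beta\delta}\Gamma^\nu{}_{\alpha\gamma}\right). \tag{5}$$

After linearization, we have 

$$R_{\alpha\beta\gamma\delta} = \tfrac{1}{2}\left(h_{\alpha\delta,\beta\gamma} + h_{\beta\gamma,\alpha\delta} - h_{\alpha\gamma,\beta\delta} - h_{\beta\delta,\alpha\gamma}\right) + O(h^2), \tag{6}$$

$$R_{\alpha\gamma} = \tfrac{1}{2}\left(h_{\alpha\beta,}{}^{\beta}{}_{\gamma} + h_{\beta\gamma,\alpha}{}^{\beta} - h_{\alpha\gamma,\beta}{}^{\beta} - h_{\beta}{}^{\beta}{}_{,\alpha\gamma}\right) + O(h^2), \tag{7}$$

$$R = h_{\alpha\beta,}{}^{\alpha\beta} - h_{\beta}{}^{\beta}{}_{,\alpha}{}^{\alpha} + O(h^2), \tag{8}$$

where $O(h^2)$ denotes terms of order $h_{\alpha\beta}h_{\mu\nu}$ or smaller. Now we choose the 
harmonic gauge condition for $h_{\alpha\beta}$, 

$$\left[h_{\alpha\beta} - \tfrac{1}{2}\eta_{\alpha\beta}{\rm Tr}(h)\right]^{,\beta} = 0 + O(h^2), \quad \text{i.e.}\quad h_{\alpha\beta,}{}^{\beta} = \tfrac{1}{2}({\rm Tr}\,h)_{,\alpha} + O(h^2), \tag{9}$$

where ${\rm Tr}(h)$ is defined as the trace of $h_\alpha{}^\beta$, i.e. ${\rm Tr}(h) \equiv h_\alpha{}^\alpha$. Now the linearized 
Einstein equation can be derived from (3), (7) and (9) and written in the form: 

$$h_{\mu\nu,\alpha}{}^{\alpha} = -16\pi G_N\left[T_{\mu\nu} - \tfrac{1}{2}\eta_{\mu\nu}T\right] + O(h^2). \tag{10}$$

This is the linearized wave equation for GR. The corresponding equation for 
electromagnetism is 

$$A_{\mu,\nu}{}^{\nu} = 4\pi J_\mu, \tag{11}$$

with gauge condition 

$$A_{\mu,}{}^{\mu} = 0. \tag{12}$$

The retarded solution of Eq. (11) is 

$$A_\mu = \int \left[\frac{J_\mu}{r}\right]_{\rm retarded} d^3x'. \tag{13}$$

Analogously, the solution of the equation for GR in the harmonic gauge is 

$$h_{\mu\nu} = -4G_N\int \left\{\frac{T_{\mu\nu} - \tfrac{1}{2}\eta_{\mu\nu}T}{r}\right\}_{\rm retarded} d^3x' + O(h^2). \tag{14}$$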


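In the linearized scheme, it is useful to represent the solution $h_{\mu\nu}(x, y, z, t)$ outside 
of the source region by its spectral components with wave vector $(k_x, k_y, k_z)$ and 
frequency $f$. First, find the Fourier transform ${}^{(k)}h_{\mu\nu}(k_x, k_y, k_z)$ of $h_{\mu\nu}(x, y, z, t)|_{t=0}$: 

$${}^{(k)}h_{\mu\nu}(k_x, k_y, k_z) \equiv \int h_{\mu\nu}(x, y, z, 0)\exp(-ik_x x - ik_y y - ik_z z)\,dx\,dy\,dz. \tag{15}$$

The integration is from $-\infty$ to $\infty$ for each integration variable. From Eq. (10), for 
each spectral component, the frequency $f$ is given by the dispersion relation 

$$f = \frac{c}{2\pi}\left(k_x^2 + k_y^2 + k_z^2\right)^{1/2} = \frac{ck}{2\pi}. \tag{16}$$

Hence the solution is 

$$h_{\mu\nu}(x, y, z, t) = \left(\frac{1}{2\pi}\right)^3\int {}^{(k)}h_{\mu\nu}(k_x, k_y, k_z)\exp(ik_x x + ik_y y + ik_z z - 2\pi i f t)\,dk_x\,dk_y\,dk_z, \tag{17}$$

with $f$ given by (16). 

For a plane GW $h_{\mu\nu}(n_x x + n_y y + n_z z - ct)$ propagating in the $(n_x, n_y, n_z)$ direction 
with $n_x^2 + n_y^2 + n_z^2 = 1$, letting 

$$U \equiv u - ct \equiv n_x x + n_y y + n_z z - ct, \tag{18}$$

we can resolve the plane GW into the following spectral representation: 

$$h_{\mu\nu}(u, t) = h_{\mu\nu}(u - ct) = h_{\mu\nu}(U) = \frac{1}{2\pi}\int_{-\infty}^{\infty} {}^{(k)}h_{\mu\nu}(k)\exp(iku)\exp(-2\pi i f t)\,dk, \tag{19}$$

with 

$${}^{(k)}h_{\mu\nu}(k) = \int_{-\infty}^{\infty} h_{\mu\nu}(U)\exp(-ikU)\,dU. \tag{20}$$

The plane wave (19) and (20) can also be written as 

$$h_{\mu\nu}(u, t) = h_{\mu\nu}(u - ct) = h_{\mu\nu}(U) = \int_{-\infty}^{\infty} {}^{(f)}h_{\mu\nu}(f)\exp\left(\frac{2\pi i f U}{c}\right)df, \tag{21}$$

with 

$${}^{(f)}h_{\mu\nu}(f) = \frac{1}{c}\,{}^{(k)}h_{\mu\nu}\left(k = \frac{2\pi f}{c}\right) = \int_{-\infty}^{\infty} h_{\mu\nu}(u - ct)|_{u=0}\exp(2\pi i f t)\,dt. \tag{22}$$

We note that since $h_{\mu\nu}(t)$ is real, ${}^{(f)}h_{\mu\nu}(-f) = {}^{(f)}h_{\mu\nu}{}^{*}(f)$, with * denoting complex 
conjugation. Writing $\varphi_{\mu\nu}(f)$ for the phase of ${}^{(f)}h_{\mu\nu}(f)$, we then have 

$$h_{\mu\nu}(U) = \int_0^\infty 2\,|{}^{(f)}h_{\mu\nu}(f)|\cos\left(\frac{2\pi f U}{c} + \varphi_{\mu\nu}(f)\right)df = \int_0^\infty 2f\,|{}^{(f)}h_{\mu\nu}(f)|\cos\left(\frac{2\pi f U}{c} + \varphi_{\mu\nu}(f)\right)d(\ln f). \tag{23}$$

From (21), (22) and Parseval’s equality, we have 

$$\int_{-\infty}^{\infty} |h_{\mu\nu}(t)|^2\,dt = \int_{-\infty}^{\infty} |{}^{(f)}h_{\mu\nu}(f)|^2\,df \quad \text{(no summation over the indices $\mu$ and $\nu$).} \tag{24}$$

The squared-amplitude integral is equal to its squared-spectral-amplitude integral. 
One can also obtain a similar identity relating the integral of the (absolute) 
square of $h_{\mu\nu}(x, y, z, t)$ over $(x, y, z)$ and the integral of the absolute square of 
${}^{(k)}h_{\mu\nu}(k_x, k_y, k_z)$ over $(k_x, k_y, k_z)$ using (15), (17) and Parseval’s equality in 
three dimensions. 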

For a weak GW $h_{\mu\nu}$ propagating in the spacetime background $g_{\mu\nu}$ (i.e. the total 
spacetime metric is $g_{\mu\nu} + h_{\mu\nu}$), Isaacson20,21 showed that the GW stress-energy 
averaged over several wavelengths is 

$$t_{\mu\nu} = \frac{c^4}{32\pi G_N}\left\langle \partial_\mu h^{TT}{}_{\alpha\beta}\,\partial_\nu h^{TT\,\alpha\beta}\right\rangle. \tag{25}$$

Here, $h^{TT}{}_{\mu\nu}$ is the transverse traceless part of $h_{\mu\nu}$. In the special harmonic gauge 
called radiation gauge (similar to the radiation gauge in electrodynamics), $h^{TT}{}_{\mu\nu} = h_{\mu\nu}$. 
Far from the sources, the GW can be approximated by plane waves. For a 
wave propagating in the z-direction, the only nonvanishing components of $h_{\mu\nu}$ in 
radiation gauge are $h_{11}$, $h_{22}\,(= -h_{11})$, $h_{12}$ and $h_{21}\,(= h_{12})$. The mass-energy density 
$t_{00}$ and mass-energy flux $c\,t_{03}$ are given by 

$$\rho c^2 \equiv t_{00} = t_{03} = \frac{c^4}{16\pi G_N}\left\langle \tfrac{1}{4}\left(\partial_0 h_{11} - \partial_0 h_{22}\right)^2 + \left(\partial_0 h_{12}\right)^2\right\rangle = \frac{c^4}{16\pi G_N}\left\langle \left(\partial_0 h_+\right)^2 + \left(\partial_0 h_\times\right)^2\right\rangle, \tag{26}$$

in agreement with Ref. 22. Here, $h_+\,(\equiv (1/2)(h_{11} - h_{22}) = h_{11} = -h_{22})$ is the 
amplitude of the $e_+$-polarization (+-polarization); $h_\times\,(\equiv h_{12} = h_{21})$ is the amplitude 
of the $e_\times$-polarization (×-polarization). Due to gauge (coordinate) invariance from the 
linearized wave equation (10) in GR, for plane GWs in the direction of the z-axis, 
there are two polarizations $e_+$ and $e_\times$: 

$$e_+ \equiv \hat{x}\hat{x} - \hat{y}\hat{y}, \qquad e_\times \equiv \hat{x}\hat{y} + \hat{y}\hat{x}, \tag{27}$$

with $\hat{x}$ and $\hat{y}$ the unit vectors in the directions of the x-axis and y-axis. The product 
$\hat{x}\hat{x}$ is the tensor product. The metric tensor of an $e_+$-polarization GW is $h_+e_+$; that of an 
$e_\times$-polarization GW is $h_\times e_\times$. The total GW metric $h$ is 

$$h = h_+e_+ + h_\times e_\times; \quad \text{in component form,}\quad h_{\mu\nu} = h_+e_{+\mu\nu} + h_\times e_{\times\mu\nu}. \tag{28}$$

A GW with $e_+$-polarization contributes to the first term of the energy density formula 
(26) with squared amplitude $(\partial_0 h_+)^2$; a GW with $e_\times$-polarization contributes to the 
second term of the energy density formula (26) with squared amplitude $(\partial_0 h_\times)^2$. 

Far from the GW sources, as is the case in the present experimental/observational 
situations, the plane wave approximation is valid. Space averages can be replaced with 
time averages. For orthogonal modes, the energy can be added in quadrature. For a 
multi-frequency plane GW, the total energy density in the spectral representation 
of (26) then becomes 

$$\rho c^2 = t_{00} = \frac{c^2}{16\pi G_N}\int_{-\infty}^{\infty}(2\pi)^2|f|^3\left[|{}^{(f)}h_+(f)|^2 + |{}^{(f)}h_\times(f)|^2\right]df = \int_0^\infty {}^{(E)}S_h(f)\,df. \tag{29}$$

${}^{(E)}S_h(f)$ is defined as the (one-sided) energy spectral density of $h$ and is given by 

$${}^{(E)}S_h(f) = \frac{\pi c^2}{2G_N}f^3\left[|{}^{(f)}h_+(f)|^2 + |{}^{(f)}h_\times(f)|^2\right] = \frac{\pi c^2}{8G_N}f^2S_h(f), \tag{30}$$

with 

$$S_h(f) = 4f\left[|{}^{(f)}h_+(f)|^2 + |{}^{(f)}h_\times(f)|^2\right] = f^{-1}\left(h_c(f)\right)^2;$$
$$S_{hA}(f) = 4f\,|{}^{(f)}h_A(f)|^2 = f^{-1}\left(h_{cA}(f)\right)^2 \quad \text{for a single polarization $A$ ($A = +, \times$),} \tag{31}$$

the spectral power densities of $h$ and $h_A$, respectively, and 

$$h_c(f) \equiv 2f\left[|{}^{(f)}h_+(f)|^2 + |{}^{(f)}h_\times(f)|^2\right]^{1/2}; \qquad h_{cA}(f) \equiv 2f\,|{}^{(f)}h_A(f)| \tag{31a}$$

the characteristic strains [compare with Eq. (23)]. For unpolarized GWs, 
$|{}^{(f)}h_+(f)|^2 = |{}^{(f)}h_\times(f)|^2$ and we have 

$${}^{(E)}S_h(f) = \frac{\pi c^2}{4G_N}f^2S_{hA}(f). \tag{32}$$

From (30), the energy density is proportional to $h_A^2$ for a particular polarization. 
A general GW can be resolved into a superposition of plane GWs, so formulas (29)-(32) 
are still applicable. For an early motivation and an in-step mathematical derivation, 
see, e.g. Refs. 23 and 24, respectively. 

For background or foreground stochastic GWs, it is common to use the critical 
density $\rho_c$ for closing the universe as fiducial: 

$$\rho_c = \frac{3H_0^2}{8\pi G_N} = 1.878\times 10^{-29}\left(\frac{H_0}{100\ {\rm km\,s^{-1}\,Mpc^{-1}}}\right)^2\ {\rm g/cm^3}, \tag{33}$$

where $H_0$ is the Hubble constant at present. Throughout this article, we use the 
Planck 2015 value 67.8 (±0.9) km s$^{-1}$ Mpc$^{-1}$ for $H_0$.25 In the cosmological context, 
it is more convenient to define a normalized GW spectral energy density $\Omega_{\rm gw}(f)$ 
and express the GW spectral energy density in terms of the energy density per 
logarithmic frequency interval divided by the cosmic closure density $\rho_c$ for cosmic 
GW sources or background, i.e. 

$$\Omega_{\rm gw}(f) = \frac{f}{\rho_c}\frac{d\rho(f)}{df} = \frac{\pi}{8G_N}\frac{f^3S_h(f)}{\rho_c} = \frac{\pi^2}{3H_0^2}f^3S_h(f) \quad \left(= \frac{2\pi^2}{3H_0^2}f^3S_{hA}(f)\ \text{for unpolarized GW}\right). \tag{34}$$

For the very-low-frequency band, the ultra-low-frequency band and the 
extremely-low-frequency band, this is a common choice. 
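
The fiducial density in Eq. (33) can be evaluated numerically for the adopted Hubble constant. The following Python sketch is only an illustration: the SI values used for $G_N$ and the megaparsec are rounded constants assumed here, and the function name is hypothetical.

```python
import math

G_N = 6.674e-11       # Newtonian constant in m^3 kg^-1 s^-2 (rounded, assumed)
MPC_IN_M = 3.0857e22  # one megaparsec in metres (rounded, assumed)


def critical_density(h0_km_s_mpc):
    """Closure density rho_c = 3 H0^2 / (8 pi G_N), returned in kg/m^3."""
    h0_si = h0_km_s_mpc * 1.0e3 / MPC_IN_M  # convert to s^-1
    return 3.0 * h0_si ** 2 / (8.0 * math.pi * G_N)


rho_c_planck = critical_density(67.8)  # value of H0 adopted in the text
rho_c_100 = critical_density(100.0)    # reference value for H0 = 100 km/s/Mpc
print(rho_c_planck, rho_c_100)
```

Since 1 kg/m$^3$ equals $10^{-3}$ g/cm$^3$, the second number reproduces the coefficient $1.878\times 10^{-29}$ g/cm$^3$ of Eq. (33), and the first is smaller by the factor $(0.678)^2$.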

From Eq. (14), one can derive the quadrupole formulas of the gravitational 
radiation metric and the radiated power at the lowest approximation22 (in units with $c = 1$): 

$$h_{ij}(t, \mathbf{x}) = -\frac{2G_N}{r}\left[\frac{d^2Q_{ij}}{dt^2}\right]_{\rm retarded}, \tag{35}$$

$$\frac{dP}{d\Omega} = \frac{G_N}{8\pi}\left[\frac{d^3Q_{ij}}{dt^3}e_{ij}\right]^2. \tag{36}$$

Here, $dP/d\Omega$ is the power radiated into the solid angle $d\Omega$ in the polarization 
$e_{ij}$; $Q_{ij}\,(\equiv \int \rho\,x_ix_j\,d^3x)$ is the moment of inertia of the radiating system and $e_{ij}$ is 
the polarization of the emitted GW. Summed over two polarizations and integrated 
over solid angles, the total power emitted is 

$$P = \frac{G_N}{5}\left[\frac{d^3Q_{ij}}{dt^3}\frac{d^3Q_{ij}}{dt^3} - \frac{1}{3}\frac{d^3Q_{ii}}{dt^3}\frac{d^3Q_{jj}}{dt^3}\right]. \tag{37}$$

Inserting the moment of inertia of the binary Keplerian orbit motion into 
Eq. (37) and averaging over one orbital period, Peters and Mathews26 obtained the 
following formula for the gravitational radiation loss (with the factors of $c$ restored): 

$$\langle P\rangle = \frac{32G_N^4}{5c^5}\frac{M_1^2M_2^2(M_1 + M_2)}{a^5(1 - e^2)^{7/2}}\left[1 + \frac{73}{24}e^2 + \frac{37}{96}e^4\right]. \tag{38}$$

Here, $M_1$ and $M_2$ are the two masses of the binary, $e$ is the eccentricity of the 
elliptic orbit, and $a$ the semi-major axis. Peters27 further obtained the average 
angular momentum emission rate: 

$$\left\langle\frac{dL}{dt}\right\rangle = \frac{32G_N^{7/2}}{5c^5}\frac{M_1^2M_2^2(M_1 + M_2)^{1/2}}{a^{7/2}(1 - e^2)^2}\left[1 + \frac{7}{8}e^2\right]. \tag{39}$$

From the Peters-Mathews radiation formula (38) and Peters’ angular momentum 
radiation formula (39), the decay rate of the orbital period $P_b$ can be calculated as27 

$$\frac{dP_b}{dt} = -\frac{192\pi}{5}\left(\frac{2\pi G_N}{P_b}\right)^{5/3}c^{-5}\left(1 + \frac{73}{24}e^2 + \frac{37}{96}e^4\right)(1 - e^2)^{-7/2}M_1M_2(M_1 + M_2)^{-1/3}. \tag{40}$$

From (39) and (40) Peters obtained the time evolution equations for $(da/dt)$ and 
$(de/dt)$, and found the time dependence of the semi-major axis $a(t)$ and the merging 
time $T_c(a_0)$ for circular orbits starting from initial semi-major axis $a = a_0$: 

$$a(t) = \left(a_0^4 - 4\beta t\right)^{1/4}; \qquad T_c(a_0) = \frac{a_0^4}{4\beta}, \quad \text{with}\quad \beta \equiv \frac{64}{5}\frac{G_N^3}{c^5}M_1M_2(M_1 + M_2), \tag{41}$$

in reasonable agreement with estimates from higher-order approximations and 
results from numerical relativity. 
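
To make Eqs. (40) and (41) concrete, the following Python sketch evaluates the orbital period decay and the circular-orbit merging time. It is a hedged illustration: the function names are hypothetical, the constants are rounded, and the orbital parameters used for PSR B1913+16 (period, eccentricity and the two masses) are approximate published values assumed here, not numbers quoted in this text.

```python
import math

T_SUN = 4.925490947e-6  # G_N * M_sun / c^3 in seconds
G_N = 6.674e-11         # m^3 kg^-1 s^-2 (rounded, assumed)
C = 2.998e8             # m/s (rounded, assumed)
M_SUN = 1.989e30        # kg (rounded, assumed)


def orbital_period_decay(p_b_s, e, m1_sun, m2_sun):
    """dP_b/dt from Eq. (40); masses in solar masses, period in seconds."""
    ecc = (1 + 73 / 24 * e**2 + 37 / 96 * e**4) * (1 - e**2) ** (-3.5)
    # (2 pi G_N / P_b)^(5/3) c^-5 M1 M2 (M1+M2)^(-1/3), written with T_SUN.
    return (-192 * math.pi / 5 * (2 * math.pi * T_SUN / p_b_s) ** (5 / 3)
            * ecc * m1_sun * m2_sun * (m1_sun + m2_sun) ** (-1 / 3))


def beta(m1_kg, m2_kg):
    """beta of Eq. (41) in m^4/s."""
    return 64 / 5 * G_N**3 / C**5 * m1_kg * m2_kg * (m1_kg + m2_kg)


def merging_time(a0_m, m1_kg, m2_kg):
    """T_c(a0) = a0^4 / (4 beta) for a circular orbit, in seconds."""
    return a0_m**4 / (4 * beta(m1_kg, m2_kg))


def semi_major_axis(t_s, a0_m, m1_kg, m2_kg):
    """a(t) = (a0^4 - 4 beta t)^(1/4) for a circular orbit."""
    return (a0_m**4 - 4 * beta(m1_kg, m2_kg) * t_s) ** 0.25


# Assumed approximate parameters for PSR B1913+16 (illustrative only).
pbdot = orbital_period_decay(27906.98, 0.6171334, 1.4398, 1.3886)
# Hypothetical circular binary: two 1.4 solar-mass stars, a0 = 1e9 m.
t_c = merging_time(1.0e9, 1.4 * M_SUN, 1.4 * M_SUN)
print(pbdot, t_c)
```

With the assumed B1913+16 parameters the predicted decay is about $-2.4\times 10^{-12}$ (dimensionless, seconds per second), the order of magnitude tested in the binary pulsar timing work cited in Sec. 1.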

For a binary system of masses $M_1$ and $M_2$ with Schwarzschild radii $R_1$ and 
$R_2$, the strain $h$ calculated from (35) of its emitted gravitational radiation is of the 
order of 

$$h \approx \frac{R_1R_2}{Dd}, \tag{42}$$

where $d$ is the distance between $M_1$ and $M_2$, and $D$ the distance to the observer. For a 
neutron star or black hole, $d$ can be of the order of the Schwarzschild radius and the 
estimation can be simplified: 

$$h \leq \frac{R}{D}. \tag{43}$$

For a black hole of solar mass, $R = 3$ km and $D = 10^8$ l.y., $h \leq 3\times 10^{-21}$; for the inspiral 
of neutron star binaries, the GW strain generated is smaller. 

GWs in GR have two independent polarizations. GWs in a general metric theory 
of gravity can have up to six independent polarizations according to the Riemann 
tensor classification of Eardley et al.?°; in terms of helicity, there are two scalar 
modes, one helicity +2 mode, one helicity +1 mode, one helicity −1 mode and 
one helicity −2 mode.?® Therefore, by measuring the GW polarizations, different 
theories can be distinguished and tested. For a general metric theory with additional 
fields (scalar, vector, etc.), there are monopole and/or dipole contributions 
to the quadrupole radiation formula (36). However, due to conservation of mass 
and conservation of linear momentum in the Newtonian order, the leading order 
of monopole and dipole contributions are of the same order or less compared with 
the quadrupole contribution in GR.°° Nevertheless, experiments/observations do 
distinguish them. For example, pulse timing observations on the relativistic pulsar- 
white dwarf binary PSR J1738+0333 have given stringent tests on some of these 
theories already.8 
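
The order-of-magnitude estimates (42) and (43) of this section can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: the rounded SI constants and the function names are introduced here for illustration only.

```python
G_N = 6.674e-11   # m^3 kg^-1 s^-2 (rounded, assumed)
C = 2.998e8       # m/s (rounded, assumed)
M_SUN = 1.989e30  # kg (rounded, assumed)
LY = 9.461e15     # one light-year in metres (rounded, assumed)


def schwarzschild_radius(mass_kg):
    """Schwarzschild radius R = 2 G_N M / c^2 in metres."""
    return 2.0 * G_N * mass_kg / C**2


def strain_estimate(r1_m, r2_m, separation_m, distance_m):
    """Order-of-magnitude strain of Eq. (42): h ~ R1 R2 / (D d)."""
    return r1_m * r2_m / (distance_m * separation_m)


r_sun = schwarzschild_radius(M_SUN)
# Simplified bound of Eq. (43): separation d of the order of R, source at 1e8 l.y.
h_max = strain_estimate(r_sun, r_sun, r_sun, 1.0e8 * LY)
print(r_sun, h_max)
```

The Schwarzschild radius of one solar mass comes out close to 3 km and the bound close to $3\times 10^{-21}$, matching the numbers quoted with Eq. (43).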


3. Methods of GW Detection, and Their Sensitivities 


Similar to the frequency classification of electromagnetic waves into radio wave, 
millimeter wave, infrared, optical, ultraviolet, X-ray and γ-ray etc., in Table 1, we have 
compiled a complete frequency classification of GWs. This classification together 
with the current and aimed sensitivities of various detection methods plus predicted 
GW source strengths are plotted in Figs. 1-4. Figure 1 shows the spectrum 
classification together with detection methods and projects. Figs. 2-4 show respectively 
the characteristic strain $h_c$ versus frequency plot, the strain power spectral 
density (psd) amplitude $[S_h(f)]^{1/2}$ versus frequency plot and the normalized GW 
spectral energy density $\Omega_{\rm gw}$ versus frequency plot for various GW detectors and 
sources. Detailed accounts and explanations of Figs. 2-4 are given in the following 
subsections and in Sec. 4. 

For the methods of detecting GWs, we first classify them into real-time detection 
and imprint detection. For real-time detection, we use the time scale of 100 years, 
the life span of a human being. Although this scale could be extended, it is at least 
good for the next few hundred years. Above 300 pHz [$\sim (100\ {\rm yr})^{-1}$], real-time detections 
are possible. These detections include using resonators, interferometers and pulsar 
timing for detection in the first six GW bands in Table 1. Below 300pHz, the 
detections are possible on GW imprints. Imprint (or snapshot) detections include 
(i) using the method of quasar astrometry for detection in the ultra-low-frequency 
GW band, (ii) using CMB observations for detection in Hubble frequency GW band 
and (iii) using indirect verifications of primordial (inflationary or noninflationary) 
cosmological models beyond the Hubble frequency band. 
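
The correspondence between the 100-year time scale and the 300 pHz boundary used above is a one-line conversion, sketched here in Python. The use of the Julian year (365.25 days) and the function name are assumptions of this example.

```python
SECONDS_PER_YEAR = 365.25 * 86400.0  # Julian year in seconds (assumed convention)


def inverse_timescale_hz(years):
    """Frequency corresponding to one cycle per the given number of years."""
    return 1.0 / (years * SECONDS_PER_YEAR)


f_boundary = inverse_timescale_hz(100.0)
print(f_boundary)  # about 3.2e-10 Hz, i.e. roughly 300 pHz
```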

There are basically two kinds of GW detectors for real-time detection: (i) 
the resonant type, in which the GW induces resonances in detectors (metallic bars, 
metallic spheres, resonant cavities...) to enhance sensitivities; (ii) detectors measuring 
distance change using microwave/laser/X-ray/atom/molecule... between/among 
suspended/floating test bodies. In the case of PTAs for detection in the very-low-frequency 
GW band, the floating test bodies are the pulsars and observatories, while 
the relative distance changes are measured through pulsar timing variations. Two crucial issues 
in real-time GW detection are (i) to lower disturbance effects and/or to model 
the residuals: suspension isolation, drag-free to decrease the effects of surrounding 
disturbances and appropriate modeling of the motion and the disturbances to 
reduce the uncertainties in the measurement; (ii) to increase measurement sensitivity: 
capacitive sensing, microwave sensing, SQUID transducing, optical sensing, 
X-ray sensing, atom sensing, molecule sensing and timing.... 

Fig. 1. The GW Spectrum Classification (updated from Refs. 16 and 17). [Figure not reproduced: 
a frequency axis from $10^{-18}$ Hz to $10^{12}$ Hz showing the bands of Table 1 with detection 
methods and projects, including CMB, quasar astrometry, CSDT, ASTROD-GW, eLISA/LISA, AdLIGO, 
AdVirgo, KAGRA, ET, cavity/waveguide, laser interferometer and Gauss beam. Project lists given 
in the figure notes: AIGO, AURIGA, EXPLORER, GEO, NAUTILUS, MiniGRAIL, Schenberg; OMEGA, 
gLISA/GEOGRAWI, GADFLI, TIANQIN, ASTROD-EM, LAGRANGE, ALIA, ALIA-descope; EPTA, NANOGrav, 
PPTA, IPTA.] 


3.1. Sensitivities 


The input and output of a detector are scalar quantities. The input of a GW 
detector is a time series $h(t)$ of GW signals which can be written as a functional of 
the GW metric $h_{\alpha\beta}(x', t')$. For weak GWs, as in most situations, this functional can 
be linearized and approximated by a linear functional $D$ of $h_{\alpha\beta}(x', t')$: 

$$h(t) = D\left(h_{\alpha\beta}(x', t')\right). \tag{44}$$

For a stationary local detector, $D$ may further be reduced to a constant tensor $D^{\alpha\beta}$ 
such that 

$$h(t) = D^{\alpha\beta}h_{\alpha\beta}(x, t). \tag{45}$$

In a transverse, traceless coordinate gauge, $h(t)$ is further reduced to 

$$h(t) = D^{ij}h^{TT}{}_{ij}(x, t). \tag{46}$$

$D^{ij}$ (or $D^{\alpha\beta}$) is called the detector tensor, which depends on the detection geometry. 
Fig. 2. Characteristic strain $h_c$ versus frequency for various GW detectors and sources. [QA: Quasar Astrometry; QAG: Quasar Astrometry Goal; 
LVC: LIGO-Virgo Constraints; CSDT: Cassini Spacecraft Doppler Tracking; SMBH-GWB: Supermassive Black Hole-GW Background.] (For color 
version, see page I-CP8.) 

Fig. 3. Strain psd amplitude versus frequency for various GW detectors and GW sources. See Fig. 2 caption for the meaning of various acronyms. 
(For color version, see page I-CP9.) 

As an example, for a GW interferometer oriented with two arms on the x-axis and 
y-axis with nearly equal arm lengths in the long wavelength limit, the detector 
tensor has $D^{11} = 1/2$, $D^{22} = -1/2$ and all other components vanishing; we have 

$$h(t) = h_{11}(t), \quad {}^{(f)}h(f) = {}^{(f)}h_{11}(f) = {}^{(f)}h_+(f) \quad \text{and}\quad h_{c+}(f) = 2f\,|{}^{(f)}h_+(f)|.$$

In the case of linear response, the detector output ${}^{(f)}h^{\rm (out)}(f)$ is related to the 
input by 

$${}^{(f)}h^{\rm (out)}(f) = T(f)\times{}^{(f)}h(f) \quad \text{(or simply $h^{\rm (out)}(f) = T(f)\times h(f)$ without heavy notations),} \tag{47}$$

where $T(f)$ is the transfer function or the response function of the detector. In the 
output of any detector there will be noise also. The total output $s^{\rm (out)}(t)$ is the 
sum of the GW signal output $h^{\rm (out)}(t)$ and the noise output $n^{\rm (out)}(t)$: 

$$s^{\rm (out)}(t) = h^{\rm (out)}(t) + n^{\rm (out)}(t); \tag{48}$$

in frequency space the total output ${}^{(f)}s^{\rm (out)}(f)$ is 

$${}^{(f)}s^{\rm (out)}(f) = {}^{(f)}h^{\rm (out)}(f) + {}^{(f)}n^{\rm (out)}(f). \tag{49}$$

Habitually, it is convenient to refer and compare noise at the input port by defining 

$${}^{(f)}n(f) \equiv [T(f)]^{-1}\times{}^{(f)}n^{\rm (out)}(f). \tag{50}$$

From (50), we have 

$${}^{(f)}s^{\rm (out)}(f) = [T(f)]\times\left[{}^{(f)}h(f) + {}^{(f)}n(f)\right], \tag{51}$$

and we can define 

$${}^{(f)}s(f) \equiv [T(f)]^{-1}\times{}^{(f)}s^{\rm (out)}(f) = {}^{(f)}h(f) + {}^{(f)}n(f) \tag{52}$$

to be the total input signal. In the time domain, we then have 

$$s(t) = h(t) + n(t). \tag{53}$$

It is convenient to take $n(t)$ as the detector noise. 
It is also convenient and practical to assume that the detector noise is stationary 
and Gaussian, and that the different Fourier components are independent. We then have 

$$\left\langle{}^{(f)}n^{*}(f)\,{}^{(f)}n(f')\right\rangle \equiv \tfrac{1}{2}\delta(f - f')S_n(f), \tag{54}$$

where $S_n(f)$ is defined by the equation. From this equation, one can derive 

$$\langle n^2(t)\rangle = \langle n^2(t = 0)\rangle = \int_{-\infty}^{\infty}df\,df'\left\langle{}^{(f)}n^{*}(f)\,{}^{(f)}n(f')\right\rangle = \int_0^\infty df\,S_n(f). \tag{55}$$

Hence, $S_n(f)$ is called the noise power spectrum, the noise power spectrum density 
(noise psd), the noise spectral density, or the noise spectral sensitivity. It is one-sided 
since the integration is taken only over the positive axis. For a more detailed derivation 
of Eq. (55), we refer the reader to Refs. 24 and 31. For a dimensionless description 
of noise power at a particular frequency, one usually uses the noise amplitude $h_n(f)$, 
which is defined as 

$$h_n(f) \equiv [fS_n(f)]^{1/2}. \tag{56}$$

For comparison, we note that for a GW interferometer with two arms oriented 
on the x-axis and y-axis with nearly equal arm lengths in the long wavelength 
limit, the GW signal $h(t) = h_{11}(t) = h_+(t)$ [${}^{(f)}h(f) = {}^{(f)}h_{11}(f) = {}^{(f)}h_+(f)$] with the 
GW propagating perpendicular to the xy-plane corresponds to detector geometry with 
$D_{\mu\nu} = e_{+\mu\nu}$. Its associated strain psd $S_h(f)$ is 

$$S_h(f) = 4f\,|{}^{(f)}h(f)|^2 = \left[2f^{1/2}|{}^{(f)}h_{11}(f)|\right]^2 = \left[2f^{1/2}|{}^{(f)}h_+(f)|\right]^2 = f^{-1}\left[h_{c+}(f)\right]^2 = S_{h+}(f). \tag{57}$$

In general, a GW detector has different geometric sensitivities to monochromatic
GWs coming from different directions and with different polarizations. Hence each
detector has its own pattern function of directions and polarizations. In plotting
GW sensitivities, one usually averages over directions and polarizations for a
detector.

In the discussion of GW sensitivities and GW signal strengths, there are three
customary ways to plot: characteristic strain $h_c(f)$ versus frequency $f$, square-root
psd $[S_h(f)]^{1/2}$ versus frequency $f$ and the normalized GW spectral energy density
$\Omega_{gw}(f)$ versus frequency $f$. From (57), for this case, we define the (dimensionless)
characteristic strain for the signal $h(f)$ as in (31a):

$$h_c(f) \equiv 2f\,|{}^{(f)}h_{11}(f)| = 2f\,|{}^{(f)}h_{+}(f)|. \tag{58}$$

With this definition, we have from (57) and (58)

$$S_h(f) = f^{-1}[h_c(f)]^2 = 4f\,|{}^{(f)}h_{11}(f)|^2 = 4f\,|{}^{(f)}h_{+}(f)|^2. \tag{59}$$

Now we relate the three quantities $h_c(f)$, $[S_h(f)]^{1/2}$ and $\Omega_{gw}(f)$ using Eqs. (34)
and (58):

$$\Omega_{gw}(f) = \frac{2\pi^2}{3H_0^2}\, f^3 S_h(f); \qquad h_c(f) = f^{1/2}[S_h(f)]^{1/2}. \tag{60}$$
Table 2 compiles the conversion factors among the characteristic strain $h_c(f)$, the
strain psd $[S_h(f)]^{1/2}$ and the normalized spectral energy density $\Omega_{gw}(f)$. In using
(60) and Table 2, especially the conversion to $\Omega_{gw}(f)$, we assume the baseline detector
and source configuration just mentioned. For other configurations, the specific
detector geometry, source geometry and GW polarization need to be taken into account.

Table 2. Conversion factors among the characteristic strain $h_c(f)$, the strain psd $[S_h(f)]^{1/2}$ and
the normalized spectral energy density $\Omega_{gw}(f)$.

                             Characteristic strain $h_c(f)$     Strain psd $[S_h(f)]^{1/2}$        Normalized spectral energy density $\Omega_{gw}(f)$
$h_c(f)$                     $h_c(f)$                           $f^{1/2}[S_h(f)]^{1/2}$            $[(3H_0^2/2\pi^2 f^2)\,\Omega_{gw}(f)]^{1/2}$
Strain psd $[S_h(f)]^{1/2}$  $f^{-1/2}h_c(f)$                   $[S_h(f)]^{1/2}$                   $[(3H_0^2/2\pi^2 f^3)\,\Omega_{gw}(f)]^{1/2}$
$\Omega_{gw}(f)$             $(2\pi^2/3H_0^2)\,f^2 h_c^2(f)$    $(2\pi^2/3H_0^2)\,f^3 S_h(f)$      $\Omega_{gw}(f)$
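The conversions of Table 2 are simple enough to write as a few helper functions. The sketch below is illustrative only: the function names are hypothetical, frequencies are in Hz, and the Hubble constant is taken as 67.8 km s$^{-1}$ Mpc$^{-1}$ (the Planck 2015 value quoted elsewhere in this chapter), which is an assumption of the example and can be overridden.

```python
import math

MPC_IN_M = 3.0856775814913673e22   # metres per megaparsec
H0 = 67.8e3 / MPC_IN_M             # assumed Hubble constant, in s^-1


def hc_from_sqrt_sh(f, sqrt_sh):
    """h_c = f^(1/2) [S_h]^(1/2)."""
    return math.sqrt(f) * sqrt_sh


def sqrt_sh_from_hc(f, hc):
    """[S_h]^(1/2) = f^(-1/2) h_c."""
    return hc / math.sqrt(f)


def omega_from_hc(f, hc, h0=H0):
    """Omega_gw = (2 pi^2 / 3 H0^2) f^2 h_c^2."""
    return (2.0 * math.pi ** 2 / (3.0 * h0 ** 2)) * f ** 2 * hc ** 2


def hc_from_omega(f, omega, h0=H0):
    """h_c = [(3 H0^2 / 2 pi^2 f^2) Omega_gw]^(1/2)."""
    return math.sqrt(3.0 * h0 ** 2 / (2.0 * math.pi ** 2 * f ** 2) * omega)


def omega_from_sqrt_sh(f, sqrt_sh, h0=H0):
    """Omega_gw = (2 pi^2 / 3 H0^2) f^3 S_h, via h_c."""
    return omega_from_hc(f, hc_from_sqrt_sh(f, sqrt_sh), h0)


def sqrt_sh_from_omega(f, omega, h0=H0):
    """[S_h]^(1/2) = [(3 H0^2 / 2 pi^2 f^3) Omega_gw]^(1/2), via h_c."""
    return sqrt_sh_from_hc(f, hc_from_omega(f, omega, h0))


# hypothetical example point: h_c = 1e-22 at 100 Hz
f = 100.0
hc = 1e-22
omega = omega_from_hc(f, hc)
```

As in the text, these relations hold for the baseline detector and source configuration; other geometries or polarizations need their own factors.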


In data analysis, the optimal signal-to-noise ratio $\rho$ is obtained by
using the Wiener matched filter $h(f)/S_n(f)$:

$$\rho^2 = \int_0^{\infty} df\, \frac{4\,|{}^{(f)}h(f)|^2}{S_n(f)}. \tag{61}$$

Using Eqs. (58) and (56), (61) can be written as

$$\rho^2 = \int_0^{\infty} \frac{df}{f} \left[\frac{h_c(f)}{h_n(f)}\right]^2 = \int_{-\infty}^{\infty} d(\ln f) \left[\frac{h_c(f)}{h_n(f)}\right]^2. \tag{62}$$

Hence in the log–log plot of characteristic strain versus frequency, the square of the
signal-to-noise ratio is equal to the integral of the square of the ratio of the characteristic
strain of the source over the characteristic noise strain. For large signal-to-noise ratio,
it is approximately equal to the area between the characteristic strain curve and
the characteristic noise strain curve in the detection bandwidth.
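The logarithmic form of the signal-to-noise integral, Eq. (62), can be evaluated with a simple trapezoidal rule on a grid that is uniform in $\ln f$. The sketch below is a minimal illustration; the function name `snr_squared`, the flat source and noise curves and the band edges are all hypothetical choices made so that the result can be checked by hand.

```python
import math


def snr_squared(h_c, h_n, f_min, f_max, n=2000):
    """Trapezoidal estimate of the integral of [h_c/h_n]^2 over ln f."""
    ln_lo, ln_hi = math.log(f_min), math.log(f_max)
    step = (ln_hi - ln_lo) / n
    total = 0.0
    for i in range(n + 1):
        f = math.exp(ln_lo + i * step)
        weight = 0.5 if i in (0, n) else 1.0   # end points count half
        total += weight * (h_c(f) / h_n(f)) ** 2
    return total * step


def source_hc(f):
    """Hypothetical flat source characteristic strain."""
    return 3.0e-21


def noise_hn(f):
    """Hypothetical flat characteristic noise strain."""
    return 1.0e-21


# For a constant ratio r over [f_min, f_max], rho^2 = r^2 ln(f_max / f_min).
rho2 = snr_squared(source_hc, noise_hn, 10.0, 1000.0)
```

With a constant ratio of 3 over two decades in frequency, the integral reduces to $9 \ln 100$, which the numerical result reproduces.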

For parameter fitting, the more discernible the structure in the signal, the better the
parameters can be fitted. Please see Refs. 32–35 for good accounts.


3.2. Very high frequency band (100 kHz–1 THz) and ultrahigh
frequency band (above 1 THz)


In the very high frequency band (100 kHz–1 THz), two experiments have been completed.
A cavity/waveguide detector, in which the polarization of an electromagnetic wave
changes its direction under an incoming GW, was operated at 100 MHz and gave an upper
limit on the background GW radiation of around $10^{-14}$ Hz$^{-1/2}$.36 A 0.75 m arm
length laser interferometer, in which synchronous amplification of the phase shift due
to GWs occurs, achieved a noise level limiting the 100 MHz background
GW to $10^{-16}$ Hz$^{-1/2}$.37,38 These two upper limits are marked on Fig. 3 and
the corresponding limits on Figs. 2 and 4.

Cruise has described two types of magnetic conversion prototype detectors A 
and D being commissioned at Birmingham in Sec. 9 of Ref. 39. The basic principle 
of a magnetic conversion detector is to convert GWs in a laboratory magnetic field 
to electromagnetic waves which are then focused or concentrated on one detector 
element to be measured. Detector A has a room temperature microwave receiver
sensing the waveguide conversion volume with a magnetic field of 0.2 T. The expected
sensitivity curve of prototype detector A for one year integration with 1 MHz bandwidth
is shown in Fig. 1 of Ref. 39 as curve A. In the same figure, curve B shows
the expected sensitivity for a larger detector, 60 × 500 × 800 mm³, having a 20 K
noise temperature amplifier, and curve C shows that for a pair of such detectors in
correlation over a one year period. Curve D is for the detector with the cooled CCD
sensing the same waveguide and field as A, also for a one year integration.

We put curves A, B, C and D from Fig. 1 of Ref. 39 on Fig. 3 in this section. The
definition of $\Omega_{gw}(f)$ in Eq. (1) of Ref. 39 is the same as ours, while their conversion
to their $h$, Eq. (2) of Ref. 39, is

$$h = 5.8 \times 10^{-22} \left(\frac{100}{f}\right)^{3/2} [\Omega_{gw}(f)]^{1/2}. \tag{63}$$

Our strain psd $[S_h(f)]^{1/2}$ using the Planck $H_0$ (Ref. 20) is

$$[S_h(f)]^{1/2} = \left[\frac{3H_0^2}{2\pi^2 f^3}\,\Omega_{gw}(f)\right]^{1/2} = 8.57 \times 10^{-19} \left(\frac{1\ \mathrm{Hz}}{f}\right)^{3/2} [\Omega_{gw}(f)]^{1/2}\ \mathrm{Hz}^{-1/2}. \tag{64}$$

Therefore, their $h$ is basically our $[S_h(f)]^{1/2}$ with a multiplicative factor of 0.677.
Hence, Fig. 1 of Ref. 39 basically corresponds to our Fig. 3. We adjust by the factor
0.677 for the nucleosynthesis limit in Fig. 4 and the corresponding places in Figs. 2
and 3. We have not adjusted other parts of Fig. 1 of Ref. 39 when transferring them to our
figures, since the multiplicative factor is not large on our log–log plots.
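The two numbers quoted here, the prefactor $8.57 \times 10^{-19}$ and the ratio 0.677, can be checked with a few lines of arithmetic. The sketch assumes $H_0 = 67.8$ km s$^{-1}$ Mpc$^{-1}$ and evaluates both conversions at $f = 1$ Hz and $\Omega_{gw} = 1$; the variable names are hypothetical.

```python
import math

MPC_IN_M = 3.0856775814913673e22   # metres per megaparsec
H0 = 67.8e3 / MPC_IN_M             # assumed Hubble constant, in s^-1

# [S_h]^(1/2) at f = 1 Hz for Omega_gw = 1, i.e. sqrt(3 / (2 pi^2)) * H0
prefactor = math.sqrt(3.0 / (2.0 * math.pi ** 2)) * H0

# the h of Ref. 39 at f = 1 Hz for Omega_gw = 1
ref39_at_1hz = 5.8e-22 * 100.0 ** 1.5

ratio = ref39_at_1hz / prefactor
```

The ratio comes out close to 0.677, numerically the same as the dimensionless Hubble parameter $h = 0.678$ implied by the assumed $H_0$.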

Curve C and curve D have sensitivities in strain psd [S;,(f)]!/? close to 1 
for frequencies around 10'° Hz in the very high frequency band and 10!° Hz in the 
ultrahigh frequency band, respectively. The corresponding curves are also shown in 
Figs. 2 and 4. Possible sensitivity enhancements have been suggested by generating 
electromagnetic power depending linearly on the GW amplitudes;*? however, the 
associated noise issues are still pending on solutions.*!:°° Signal amplitudes from 
various GW sources are summarized in Sec. 4.6. 


0720 


3.3. High frequency band (10 Hz–100 kHz)


Most of the current GW detection activities on the ground or underground
are in the high frequency band. In the following, we summarize the activities and
sensitivities. For a detailed exposition, we refer to Ref. 42.

In this band, the cryogenic resonant bar detectors have already reached a strain
spectral sensitivity of $10^{-21}$ Hz$^{-1/2}$ in the kHz region. NAUTILUS put an upper
limit on periodic sources ranging from $3.4 \times 10^{-23}$ to $1.3 \times 10^{-22}$ depending on
frequency in their all-sky search.43 The AURIGA-EXPLORER-NAUTILUS-Virgo
Collaboration applied a methodology to the search for coincident burst excitations
over a 24 h long joint data set.44 The MiniGRAIL45 and Schenberg46 cryogenic
spherical GW detectors are for omnidirectional GW detection.

Major detection efforts in the high frequency band are in the long arm laser
interferometers. The TAMA 300 m arm length interferometer,47 the GEO 600 m
interferometer,48 and the kilometer size laser-interferometric GW detectors
LIGO49 (two with 4 km arm length, one with 2 km arm length) and VIRGO50 all essentially
achieved their original sensitivity goals. Around the frequency 100 Hz, the LIGO
and Virgo sensitivities are both at the level of $10^{-23}$ (Hz)$^{-1/2}$. The LIGO and Virgo
achieved sensitivity curves are shown in Figs. 2–4.49,50 Interference spikes are taken
out for clarity in the presentation in these figures. Various limits on the GW strains
for different sources become significant. For example, analyses of data from S6 (sixth
science run) of the LIGO and GEO600 GW detectors and VSR 2 and VSR 4 (Virgo
science runs) of the Virgo detector set strain upper limits on the GW emission from 195
radio pulsars; specifically, the strain upper limit on the Vela pulsar is comparable
to the spin-down limit and that on the Crab pulsar is about a factor of 2 below
the spin-down limit.51 The 2009 analysis of the data from a LIGO two-year science
run constrained the normalized spectral energy density $\Omega_{gw}(f)$ of the stochastic
GW background in the frequency band around 100 Hz to be less than $6.9 \times 10^{-6}$ at 95%
confidence.52 This search for the stochastic background improved on the indirect
limit from the Big Bang nucleosynthesis at 100 Hz. In 2014, further improvement
and refinement on the limit of the stochastic GW background were obtained from
the analysis of the 2009–2010 LIGO and Virgo data.53 Assuming a stochastic GW
spectrum of

$$\Omega_{gw}(f) = \Omega_{\alpha} \left(\frac{f}{f_{\mathrm{ref}}}\right)^{\alpha}, \tag{65}$$


LIGO and Virgo collaborations placed 95% confidence level upper limits on the
normalized spectral energy density of the background in each of four frequency
bands spanning 41.5–1726 Hz:

$$\begin{aligned}
\Omega_{gw}(f) &< 5.6 \times 10^{-6}, && \text{for the frequency band 41.5–169.25 Hz};\\
\Omega_{gw}(f) &< 1.8 \times 10^{-4}, && \text{for the frequency band 170–600 Hz};\\
\Omega_{gw}(f) &< 0.14 \left(\frac{f}{900\ \mathrm{Hz}}\right)^{3}, && \text{for the frequency band 600–1000 Hz};\\
\Omega_{gw}(f) &< 1.0 \left(\frac{f}{1300\ \mathrm{Hz}}\right)^{3}, && \text{for the frequency band 1000–1726 Hz}.
\end{aligned} \tag{66}$$

These constraints [LVC: LIGO-Virgo Constraints] are plotted on Fig. 4 and the
corresponding constraints on Figs. 2 and 3.
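For plotting, the four-band LVC constraint can be encoded as a piecewise function of frequency. The sketch below is an illustration rather than the collaborations' code: the function name is hypothetical, the numerical values are the 95% confidence limits as understood from the published LIGO–Virgo analysis, and the treatment of the band edges (for example assigning exactly 600 Hz to the second band) is an assumption of this example.

```python
def lvc_upper_limit(f_hz):
    """Return the 95% upper limit on Omega_gw at f_hz, or None outside the bands."""
    if 41.5 <= f_hz <= 169.25:
        return 5.6e-6
    if 170.0 <= f_hz <= 600.0:
        return 1.8e-4
    if 600.0 < f_hz <= 1000.0:
        return 0.14 * (f_hz / 900.0) ** 3
    if 1000.0 < f_hz <= 1726.0:
        return 1.0 * (f_hz / 1300.0) ** 3
    return None   # no limit quoted (including the gap between 169.25 and 170 Hz)


limit_100 = lvc_upper_limit(100.0)
limit_900 = lvc_upper_limit(900.0)
limit_1300 = lvc_upper_limit(1300.0)
```

The two upper bands use a spectral index of 3, so the limit there is normalized at the reference frequencies 900 Hz and 1300 Hz, respectively.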

Also, in the analysis of jointly conducted science runs (LIGO S6 and Virgo VSR
2 and VSR 3), two kinds of search were done for possible GWs associated with
154 gamma ray bursts that were observed by satellite experiments in 2009–2010:
the first search is for a signal from the coalescence of two neutron stars or a neutron
star and a black hole, and the second search is for a burst-like GW signal from the
collapse of a massive star. No signals were detected. This result places exclusion
distances of 17 Mpc for a collapsing star, 16 Mpc for the coalescence of binary
neutron stars and 28 Mpc for that of a neutron star and a black hole associated
with the observed 154 gamma ray bursts.54

All of the above long baseline laser interferometers finished their
first phase of operation by 2010. A sensitivity improvement of one order of magnitude
is underway by upgrading LIGO and Virgo to advanced interferometers, adLIGO55
and adVirgo,56 and by initiating a new project, KAGRA/LCGT.57 This improvement
will increase the detection volume by three orders of magnitude. These GW
detectors are the second-generation interferometers. The advanced LIGO
started earliest and has already achieved a 3.5-fold improvement in sensitivity;
it began its first observing run (O1) on September 18, 2015, searching for GWs.
We plot the adLIGO sensitivity achieved in February 2015 together with the planned
strain sensitivities of adLIGO,55 adVirgo56 and KAGRA57 on Figs. 2–4. KAGRA
will be a cryogenic underground interferometer with 3 km arm length; it will already
have some features of the third generation GW interferometers. ET (Einstein
Telescope)58 is a third generation GW interferometer. It will be a cryogenic underground
interferometer with 10 km arm length. Its goal sensitivity is also plotted on
Figs. 2–4.

As to the upper range of this band, note that at every free spectral range
(FSR) away from the lock point there would be good sensitivity. The FSRs of LIGO
and VIRGO/KAGRA are 37.5 kHz and 50 kHz. LIGO is considering/discussing
this frequency. Although digitization below 100 kHz is technologically feasible,
sampling and digitizing the data at these high frequencies is a practical problem.
Nevertheless, the upper range of the high frequency band is accessible
to the km-sized GW interferometers.
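The free spectral range of an arm cavity of length $L$ is $c/(2L)$, which places the FSRs of km-sized interferometers in the tens of kHz. The short sketch below evaluates it for the 4 km and 3 km arm lengths mentioned in this section; the function name is hypothetical.

```python
C = 299_792_458.0   # speed of light in m/s


def free_spectral_range(arm_length_m):
    """FSR in Hz of a Fabry-Perot arm cavity of the given length."""
    return C / (2.0 * arm_length_m)


fsr_4km = free_spectral_range(4000.0)   # LIGO-like arm
fsr_3km = free_spectral_range(3000.0)   # Virgo/KAGRA-like arm
```

Both values fall inside the 10 Hz–100 kHz high frequency band, which is why its upper range is reachable by these detectors.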


3.4. Doppler tracking of spacecraft (1 μHz–1 mHz in the
low-frequency band)


Doppler tracking of spacecraft can be used to constrain (or detect) the level of
low-frequency GWs.59 The separated test masses of this GW detector are the Doppler
tracking radio antenna on Earth and a distant spacecraft. Doppler tracking measures
relative distance change. Estabrook and Wahlquist derived59 the effect of GWs
passing through the line of sight of the spacecraft on the Doppler tracking frequency
measurements (see also Ref. 60). From these measurements, GWs can be detected
or constrained. The most recent measurements came from the Cassini spacecraft
Doppler tracking (CSDT). Armstrong et al.61 used precision Doppler tracking of
the Cassini spacecraft during its 2001–2002 solar opposition to derive improved
observational limits on an isotropic background of low-frequency GWs. They used
the Cassini multilink radio system and an advanced tropospheric calibration system
to reduce the effects of the leading noises, plasma and tropospheric scintillation, to a
level below the other noises. The resulting data were used to construct upper limits
on the strength of an isotropic background in the 1 μHz–1 mHz band.61 The characteristic
strain upper limit curve labelled CSDT in Fig. 2 is a smoothed version
of the curve in Fig. 4 of Ref. 61. The corresponding CSDT curves on the strain
psd amplitude in Fig. 3 and the normalized spectral energy density in Fig. 4 are
calculated using Table 2 for conversion. The minimal points on these curves are


il) 22<e10-*, at frequency about 0.3 mHz; 
iS. <exi0-” Hz '/?, at several frequencies in the 0.2-0.7mHz band; 


$\Omega_{gw}(f) < 0.03$, at frequency 1.2 μHz. (67)


The GW sensitivity of spacecraft Doppler tracking could still be improved by
1–2 orders of magnitude with a space-borne optical clock on board.62

The basic principles of spacecraft Doppler tracking, of spacecraft laser ranging,
of space laser interferometers and of PTAs for GW detection are similar. In the
development of further GW detection methods, the spacecraft Doppler tracking method
has been a significant source of inspiration. ASTROD I (Astrodynamical Space Test
of Relativity Using Optical Devices I),63 using a space-borne precision clock, has
included as one of its goals improving on the GW sensitivity of the CSDT by one order
of magnitude. The methods using space laser interferometers and using PTAs are
two important methods of detecting GWs; their sensitivities will be discussed in
Secs. 3.5 and 3.6, respectively.


3.5. Space interferometers (low-frequency band, 100 nHz–100 mHz;
middle-frequency band, 100 mHz–10 Hz)


Space laser interferometers for GW detection (eLISA/LISA,°+® ASTROD,%°° 
ASTROD-GW,15—16.68:69 ASTROD-EM,°:° Supner-ASTROD,’! DECIGO,” Big 
Bang  Observer,“2 =ALIA,“4 = ALIA-descope,”  — gL ISA/GEOGRAWI,°—8 
GADFLI,” LAGRANGE,®° OMEGA,®! and TIANQIN®?) hold the most promise 
with high signal-to-noise ratio. Laser Interferometer Space Antenna (LISA)® is 
aimed at detection of $10^{-4}$–1 Hz GWs with a strain sensitivity of $4 \times 10^{-21}$/(Hz)$^{1/2}$
at 1 mHz. There are abundant sources for eLISA/LISA, ASTROD, ASTROD-GW
and Earth-orbiting missions: (i) in our Galaxy: galactic binaries (neutron stars,
white dwarfs, etc.); (ii) extra-galactic targets: supermassive black hole binaries and
supermassive black hole formation; and (iii) the cosmic GW background. A date of
launch of eLISA or substitute mission is set around 2034.* 

Early in 2009, responding to the call for GW mission studies of Chinese Academy 
of Sciences (CAS), a dedicated GW mission concept ASTROD-GW with 3 S/C 
(spacecraft) orbiting near Sun—Earth Lagrange points L3, L4 and L5, respectively 
was proposed and studied. Before that, Super-ASTROD which was proposed in 
1997/° with S/C in Jupiter-like orbits was studied as a dual mission for GW mea- 
surement and for cosmological model/relativistic gravity test." With the proposal 
of ASTROD-GW, the baseline GW configuration of Super-ASTROD takes 3 S/C
orbiting respectively near Sun–Jupiter Lagrange points L3, L4 and L5. For the possibility
of a down-scaled version of the ASTROD-GW mission, the ASTROD-EM, with
the orbits of 3 S/C near Earth–Moon Lagrange points L3, L4 and L5, respectively,
has been under study.” 

DECi-hertz Interferometer GW Observatory (DECIGO)72 was proposed in 2001
with the aim of detecting GWs from the early universe in the observation band (the
middle frequency band) between the terrestrial band and the band of low-frequency
space GW detectors. It will use a Fabry–Perot method (instead of a delay line
method) as in the ground interferometers, but with a 1000 km arm length. As a
LISA follow-on, BBO (Big Bang Observer)73 was proposed in the United States
with a similar goal. A likely version of DECIGO/BBO is to have 12 S/C in LISA-like
orbits with correlated detection. They will be used for the direct measurement
of the stochastic GW background by correlation analysis.? 6S/C-ASTROD-GW 
has also been considered to possibly explore the relic GWs in the lower part of the 
low-frequency band. ALIA was proposed as a less ambitious LISA follow-on. A
de-scoped ALIA75 has also been proposed and is under study.

After the end in 2011 of the ESA-NASA partnership for flying LISA, NASA solicited
“Concepts for the NASA Gravitational Wave Mission” proposals on September 27,
2011 for the study of low cost GW missions (http://nspires.nasaprs.com/external/).
gLISA/GEOGRAWI76–78 (geosynchronous LISA/GEOstationary GRAvitational
Wave Interferometer), GADFLI (Geostationary Antenna for Disturbance-Free
Laser Interferometer), and LAGRANGE80 (Laser Gravitational-wave Antenna at
Geo-lunar Lagrange points) were proposed; OMEGA81 (Orbiting Medium Explorer
for Gravitational Astronomy) re-emerged. OMEGA was first proposed as a low-cost
alternative to LISA in the 1990s. In China, a GW mission in Earth orbit called
TIANQIN82 has been proposed and is under study.

Table 3 lists the orbit configuration, arm length, orbit period and S/C number 
of various GW space mission proposals. 

Typical frequency sensitivity spectrum of strain for space GW detection consists 
of three regions (Fig. 3), the acceleration noise region, the shot noise (flat for current 
space detector projects like LISA in strain psd) region, if any, and the antenna 
response region. The lower frequency region for the detector sensitivity is dominated 
by vibration, acceleration noise or gravity-gradient noise. The higher frequency part 
of the detector sensitivity is restricted by antenna response (or storage time). In a 
power-limited design, sometimes there is a middle flat region in which the sensitivity 
is limited by the photon shot noise.!0-6.84 

The shot noise sensitivity limit in the strain for GW detection is inversely proportional
to $P^{1/2} l$, with $P$ the received power and $l$ the distance. Since $P$ is inversely
proportional to $l^2$, $P^{1/2} l$ is constant and this sensitivity limit is independent of the
distance. For 1–2 W emitting power, the limit is around $10^{-21}$ Hz$^{-1/2}$. As noted
in the LISA study,® making the arms longer shifts the time-integrated sensitiv- 
ity curve to lower frequencies while leaving the bottom of the curve at the same 
level. Hence, ASTROD-GW with longer arm length has better sensitivity at lower 
frequency. eLISA and GW interferometers in Earth orbit have shorter arms and 
therefore have better sensitivities at higher frequency. 
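The scaling argument for the shot noise limit can be made concrete in a couple of lines. The sketch below works in arbitrary units and assumes a fixed emitted power and telescope, so that the received power falls as $1/l^2$; the function name and the unit normalization are hypothetical.

```python
def relative_shot_noise_strain(arm_length, emitted_power=1.0):
    """Shot-noise strain limit in arbitrary units, proportional to 1 / (P^(1/2) l)."""
    received_power = emitted_power / arm_length ** 2   # beam spreading
    return 1.0 / (received_power ** 0.5 * arm_length)


# arm lengths in metres for a LISA-like and an ASTROD-GW-like mission
lisa_like = relative_shot_noise_strain(5e9)
astrod_gw_like = relative_shot_noise_strain(260e9)
```

Both calls return the same value, so the floor of the sensitivity curve does not move with arm length; only the frequency at which it is reached does.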


Table 3. A compilation of GW mission proposals. 


Mission concept S/C configuration Arm length Orbit period S/C #

Solar-Orbit GW Mission Proposals 
LISA® Earth-like solar orbits with 20° lag 5 Gm 1 year 3 
eLISA®4 Earth-like solar orbits with 10° lag 1 Gm 1 year 3 
ASTROD-GW®8 Near Sun—Earth L3, L4, L5 points 260 Gm 1 year 3 
Big Bang Observer7? Earth-like solar orbits 0.05 Gm 1 year 12 
DECIGO?2 Earth-like solar orbits 0.001 Gm 1 year 12 
ALIA™4 Earth-like solar orbits 0.5 Gm 1 year 3 
ALIA-descope™” Earth-like solar orbits 3 Gm 1 year 3 
Super-ASTROD™! Near Sun—Jupiter L3, L4, L5 points (3 S/C), 1300 Gm 11 year 4or5 

Jupiter-like solar orbit(s)(1-2 S/C) 

Earth-Orbit GW Mission Proposals 
OMEGA®! 0.6 Gm height orbit 1 Gm 53.2 days 6 
gLISA/GEOGRAWI™~—78 Geostationary orbit 0.073 Gm 24 hours 3 
GADFLI? Geostationary orbit 0.073 Gm 24 hours 3 
TIANQIN®? 0.057 Gm height orbit 0.11 Gm 44 hours 3 
ASTROD-EM69,70 Near Earth–Moon L3, L4, L5 points 0.66 Gm 27.3 days 3
LAGRANGE80 Near Earth–Moon L3, L4, L5 points 0.66 Gm 27.3 days 3

In Figs. 2–4, we plot sensitivity curves for LISA, eLISA and ASTROD-GW
for the low-frequency GW band. In the Mock LISA Data Challenge (MLDC)
program, the consensus goal for the LISA instrumental noise density amplitude
${}^{(\mathrm{MLDC})}S_{Ln}^{1/2}(f)$ is

$${}^{(\mathrm{MLDC})}S_{Ln}^{1/2}(f) = \frac{1}{L_L} \times \left\{ \left[1 + 0.5\left(\frac{f}{f_L}\right)^2\right] \times S_{Lp} + \left[1 + \left(\frac{10^{-4}\ \mathrm{Hz}}{f}\right)^2\right] \frac{4S_a}{(2\pi f)^4} \right\}^{1/2} \mathrm{Hz}^{-1/2}, \tag{68}$$

where $L_L = 5 \times 10^{9}$ m is the LISA arm length, $f_L = c/(2\pi L_L)$ is the LISA arm
transfer frequency, $S_{Lp} = 4 \times 10^{-22}$ m$^2$ Hz$^{-1}$ is the LISA (white) position noise
level due to photon shot noise and $S_a = 9 \times 10^{-30}$ m$^2$ s$^{-4}$ Hz$^{-1}$ is the LISA white
acceleration noise (power) level.85 Note that (68) contains the “reddening” factor
$[1 + (10^{-4}\ \mathrm{Hz}/f)^2]$ in the acceleration noise term.

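If we drop the “reddening” factor, the enhanced LISA instrumental noise density
amplitude ${}^{(\mathrm{Enhanced})}S_{Ln}^{1/2}(f)$ becomes

$${}^{(\mathrm{Enhanced})}S_{Ln}^{1/2}(f) = \frac{1}{L_L} \times \left\{ \left[1 + 0.5\left(\frac{f}{f_L}\right)^2\right] \times S_{Lp} + \frac{4S_a}{(2\pi f)^4} \right\}^{1/2} \mathrm{Hz}^{-1/2}. \tag{69}$$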


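The eLISA arm length $L_{eL}$ is five times shorter. Its instrumental noise density
amplitude ${}^{(\mathrm{MLDC})}S_{eLn}^{1/2}(f)$ is

$${}^{(\mathrm{MLDC})}S_{eLn}^{1/2}(f) = \frac{1}{L_{eL}} \times \left\{ \left[1 + 0.5\left(\frac{f}{f_{eL}}\right)^2\right] \times S_{eLp} + \left[1 + \left(\frac{10^{-4}\ \mathrm{Hz}}{f}\right)^2\right] \frac{4S_a}{(2\pi f)^4} \right\}^{1/2} \mathrm{Hz}^{-1/2}, \tag{70}$$

where $L_{eL} = 10^{9}$ m is the eLISA arm length, $f_{eL} = c/(2\pi L_{eL})$ is the eLISA
arm transfer frequency, and $S_{eLp} = 1 \times 10^{-22}$ m$^2$ Hz$^{-1}$ is the eLISA (white) position
noise level due to photon shot noise, assuming that the telescope diameter is 25 cm
(compared with 40 cm for that of LISA) and that the laser power is the same as for
LISA. The corresponding enhanced eLISA instrumental noise density amplitude
${}^{(\mathrm{Enhanced})}S_{eLn}^{1/2}(f)$ is

$${}^{(\mathrm{Enhanced})}S_{eLn}^{1/2}(f) = \frac{1}{L_{eL}} \times \left\{ \left[1 + 0.5\left(\frac{f}{f_{eL}}\right)^2\right] \times S_{eLp} + \frac{4S_a}{(2\pi f)^4} \right\}^{1/2} \mathrm{Hz}^{-1/2}. \tag{71}$$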


For ASTROD-GW, our goal for the instrumental strain noise density amplitude is

$$S_{An}^{1/2}(f) = \frac{1}{L_A} \times \left\{ \left[1 + 0.5\left(\frac{f}{f_A}\right)^2\right] \times S_{Ap} + \frac{4S_a}{(2\pi f)^4} \right\}^{1/2} \mathrm{Hz}^{-1/2}, \tag{72}$$

over the frequency range of 100 nHz < f < 1 Hz. Here, $L_A = 260 \times 10^{9}$ m is the
ASTROD-GW arm length, $f_A = c/(2\pi L_A)$ is the ASTROD-GW arm transfer frequency,
$S_a = 9 \times 10^{-30}$ m$^2$ s$^{-4}$ Hz$^{-1}$ is the white acceleration noise level (the same
as that for LISA) and $S_{Ap} = 10816 \times 10^{-22}$ m$^2$ Hz$^{-1}$ is the (white) position noise
level due to laser shot noise, which is 2704 ($= 52^2$) times that for LISA.13,14,16,68,86
The corresponding noise curve for the ASTROD-GW instrumental noise density
amplitude ${}^{(\mathrm{MLDC})}S_{An}^{1/2}(f)$ with the same “reddening” factor as specified in the MLDC
program is

$${}^{(\mathrm{MLDC})}S_{An}^{1/2}(f) = \frac{1}{L_A} \times \left\{ \left[1 + 0.5\left(\frac{f}{f_A}\right)^2\right] \times S_{Ap} + \left[1 + \left(\frac{10^{-4}\ \mathrm{Hz}}{f}\right)^2\right] \frac{4S_a}{(2\pi f)^4} \right\}^{1/2} \mathrm{Hz}^{-1/2}, \tag{73}$$

over the frequency range of 100 nHz < f < 1 Hz. The sensitivity curves from the
six formulas (68)–(73) are shown in Fig. 3. The corresponding sensitivity curves in
terms of $h_c(f)$ and $\Omega_{gw}(f)$ are shown in Figs. 2 and 4, respectively.
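The six noise formulas share one structure: a position (shot) noise term with the arm transfer factor, plus an acceleration noise term with an optional reddening factor, all divided by the arm length. A minimal sketch follows, using the parameter values quoted in the text; the dictionary layout and the function name `strain_noise_amplitude` are hypothetical, and the frequency is assumed to be in Hz.

```python
import math

C = 299_792_458.0   # speed of light, m/s
S_A = 9e-30         # white acceleration noise level, m^2 s^-4 Hz^-1

# mission: (arm length in m, position noise level in m^2 Hz^-1)
MISSIONS = {
    "LISA": (5e9, 4e-22),
    "eLISA": (1e9, 1e-22),
    "ASTROD-GW": (260e9, 10816e-22),
}


def strain_noise_amplitude(f, mission, reddening=False):
    """Instrumental strain noise density amplitude in Hz^-1/2.

    reddening=True applies the [1 + (1e-4 Hz / f)^2] factor to the
    acceleration noise term; reddening=False drops it.
    """
    arm, s_p = MISSIONS[mission]
    f_arm = C / (2.0 * math.pi * arm)               # arm transfer frequency
    position = (1.0 + 0.5 * (f / f_arm) ** 2) * s_p
    accel = 4.0 * S_A / (2.0 * math.pi * f) ** 4
    if reddening:
        accel *= 1.0 + (1e-4 / f) ** 2
    return math.sqrt(position + accel) / arm


# acceleration-noise regime: the longer arm wins by the arm-length ratio
low_f_ratio = (strain_noise_amplitude(1e-6, "LISA")
               / strain_noise_amplitude(1e-6, "ASTROD-GW"))

# near the bottom of the LISA curve
lisa_5mhz = strain_noise_amplitude(5e-3, "LISA")
```

At 1 μHz the ratio is the arm-length ratio 52, while near the bottom of each curve the level is set by $S_p^{1/2}/L$, which is the same for LISA and ASTROD-GW with these parameters.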

The LISA Pathfinder Mission has been shipped to Kourou and is scheduled 
for launch from Kourou Spaceport on Arianespace Flight VV06 on 2nd December 
2015. It is a technology demonstration mission. Its success will pave the road for 
future space GW missions. (Note added in proof: LISA Pathfinder was successfully 
launched on 3rd December 2015.)

The sensitivity curve of a single DECIGO interferometer as shown in Fig. 3
is from Ref. 87. BBO has a similar single-interferometer sensitivity curve. The one-sigma,
power-law integrated sensitivity curve for BBO (BBO-corr) shown in
Fig. 3 was obtained by Thrane and Romano.88 That of DECIGO is similar. We also
put in the plot their LISA autocorrelation measurement sensitivity curve (LISA-corr)
in a single detector, assuming perfect subtraction of instrumental noise and/or
any unwanted astrophysical foreground.88 The minimum autocorrelation sensitivity
using the same method for ASTROD-GW is also estimated and plotted in Fig. 3;
this would also be the level that 6 S/C ASTROD-GW68 (6 S/C ASTROD-GW-corr)
could reach. For comparison, the one-sigma, power-law integrated sensitivity
curve for the adLIGO H1L1 (adLIGO-corr) from Ref. 88 is also plotted in Fig. 3.
All of the corresponding curves are plotted in Figs. 2 and 4.

The development in atom interferometry is fast and promising. It already con- 
tributes to precision measurement and fundamental physics. A proposal using atom 
interferometry to detect GWs has been raised at Stanford University as an alternate 
method to LISA on the LISA bandwidth.®%-" Issues have arisen on its realization 
of LISA sensitivity.°'°? In Observatoire de Paris, SYRTE has started the first stage 
of its project — MIGA (Matter-wave laser Interferometric Gravitation Antenna)®* 
of building a 300m long optical cavity to interrogate atom interferometers at the 
underground laboratory LSBB in Rustrel. In the second stage of the project (2018— 
2023), MIGA will be dedicated to science runs and data analyses in order to probe 
the spatio-temporal structure of the local field of the LSBB region. In the meantime,
MIGA will assess future potential applications of atom interferometry to GW detection
in the middle-frequency band (0.1–10 Hz).


3.6. Very-low-frequency band (300 pHz–100 nHz)


When GWs are propagating across the line of sight of pulsar observations, the 
pulse arrival times are affected. This effect can be used to observe the GWs. For 
isotropic stochastic GW background, Hellings and Downs derived a formula on the 
correlation in the timing residuals as a function of pairs of pulsars and used it to 
constrain the energy density in GWs of frequency between 4—10 nHz to be less than 
1.4 x 1074 of the cosmic critical density in 1983.73 In 1996 and 2002, the upper 
limits from pulsar timing observations on a GW background derived by McHugh 
et al.°* and by Lommen®® are Qgw < 1077 in the frequency range 4-40 nHz, and 
Qew <4 x 107° at 6 x 107° Hz, respectively. 

Now there are 4 major PTAs: the European PTA (EPTA),°° the NANOGrav,?” 
the Parks PTA (PPTA)®® and the International (EPTA, NANOGrav and PPTA 
combined) PTA (IPTA).°? For recent reviews on pulsar timing and PTAs for GW 
detection, please see Refs. 100 and 101. These 4 PTAs have recently improved greatly on the
sensitivity for GW detection.102–104 Upper limits on the isotropic stochastic
background from EPTA, PPTA and NANOGrav are listed in Table 4. These
limits assume that the GW background has the following frequency dependence
with $\alpha = -2/3$:

$$h_c(f) = A_{yr}\,[f/(1\ \mathrm{yr}^{-1})]^{\alpha}. \tag{74}$$


The most stringent limit is from Shannon et al.,103 who used observations of millisecond
pulsars from the Parkes telescope to constrain $A_{yr}$ to less than $1.0 \times 10^{-15}$ with 95%
confidence. This limit already excludes present and most recent model predictions
with 91–99.7% probability.103 The three experiments form a robust upper limit of
$1 \times 10^{-15}$ on $A_{yr}$ at 95% confidence level, ruling out most models of supermassive
black hole formation. The limit is shown as a constraint on the Supermassive
Black Hole Binary GW Background (SBHB-GWB) in Fig. 2; the corresponding
constraints are also shown in Figs. 3 and 4. Since more energy of GWs might be
emitted with higher frequency in the hierarchy of supermassive black hole forma- 
tion, we extrapolate this constraint linearly instead with a knee using dotted line 
to 1 x 10-> Hz with some confidence. Constraints with other a values have similar 
order of magnitudes. 

To have an outlook of sensitivity of PTAs for the next hundred years, we adopt 
and extrapolate the estimates of Moore, Taylor and Gair.'°° The sensitivity for 
a monochromatic GW of a PTA depends mainly on the timing accuracies,
including timing residuals after modelling (rms deviations in timing residuals). The
bandwidth depends on the sampling frequencies, i.e. the cadence, and on the duration of
the data span. For observations every $\Delta t$ and an observation span of $T$, the
frequency band is $[1/T, 1/\Delta t]$. The frequency dependence of the sensitivity in $h_c(f)$
is linear in $f$:

$$h_c(f) = B_{yr}\,(f/\mathrm{yr}^{-1}), \qquad \frac{1}{T} < f < \frac{1}{\Delta t}. \tag{75}$$

Table 4. Upper limits on the isotropic stochastic background from three PTAs.
The constraint is on the characteristic strain $h_c(f) = A_{yr}[f/(1\ \mathrm{yr}^{-1})]^{-2/3}$ for $f = 10^{-9}$–$10^{-7}$ Hz.

PTA        Ref.   No. of pulsars   No. of years   Observation radio   Constraint on
                  included         observed       band [MHz]          characteristic strain
EPTA       102    6                18             120–3000            $A_{yr} < 3 \times 10^{-15}$
PPTA       103    4                11             3100                $A_{yr} < 1 \times 10^{-15}$
NANOGrav   104    27               9              327–2100            $A_{yr} < 1.5 \times 10^{-15}$

We assume $B_{yr}$ is proportional to the timing accuracy, inversely proportional to
the observation time span and inversely proportional to the number of pulsars in
the PTA. In Moore et al.,106 a canonical PTA (MTG canonical PTA) with 36
pulsars randomly distributed on the sky, observed every two weeks with a precision
of 100 ns over five years, was assumed; this canonical PTA has a sensitivity (75)
with $B_{yr} = 4 \times 10^{-16}$ and is roughly equivalent to the OPEN 1 mock dataset in the
IPTA data challenge.106 In Table 5, we compile the projected sensitivities for IPTA,
FAST107 and SKA108 for an observation span of 20 years, 50 years and 100 years,
respectively. To obtain a fiducial sensitivity of IPTA, we take the MTG canonical
PTA,106 but extend the observation time span to 20 years. The sensitivity is
$1 \times 10^{-16}$ at $f = \mathrm{yr}^{-1} = 3.17 \times 10^{-8}$ Hz. With the advent of new and more sensitive
observing facilities, PTA sensitivity will be improved. The 500 m Aperture Spherical
Radio Telescope (FAST)107 is under construction in Guizhou, China. Since the
305 m radio telescope of Arecibo Observatory has been working for 52 years, we
expect that FAST will also work for more than 50 years. In obtaining a fiducial
sensitivity, we assume the FAST PTA to observe 50 pulsars with 50 ns timing
accuracy for a 50 year time span. The Square Kilometre Array (SKA)108 in South
Africa and Australia will certainly improve on existing limits, and we assume pulsar
timing measurements every two weeks for 100 pulsars with 20 ns timing accuracies
for 100 years. Table 5 lists the basic assumptions for IPTA, FAST and SKA and
their projected sensitivities in $B_{yr}$ on the characteristic strain.


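Table 5. Sensitivities of IPTA, FAST and SKA to monochromatic GWs.
The sensitivity is in characteristic strain $h_c(f) = B_{yr}(f/\mathrm{yr}^{-1})$ for monochromatic GWs.

PTA     Ref.   No. of pulsars   No. of years     Timing accuracy   Sensitivity in
                                of observation   (ns)              characteristic strain
IPTA    106    36               20               100               $B_{yr} = 1 \times 10^{-16}$
FAST    107    50               50               50                $B_{yr} = 1.5 \times 10^{-17}$
SKA     108    100              100              20                $B_{yr} = 1.5 \times 10^{-18}$

The sensitivity curves of IPTA, FAST and SKA:

$$\begin{aligned}
h_c(f) &= 1 \times 10^{-16}\,(f/\mathrm{yr}^{-1}), && 1.58 \times 10^{-9}\ \mathrm{Hz} < f < 8.27 \times 10^{-7}\ \mathrm{Hz}, && \text{for IPTA-20},\\
h_c(f) &= 1.5 \times 10^{-17}\,(f/\mathrm{yr}^{-1}), && 6.34 \times 10^{-10}\ \mathrm{Hz} < f < 8.27 \times 10^{-7}\ \mathrm{Hz}, && \text{for FAST-50},\\
h_c(f) &= 1.5 \times 10^{-18}\,(f/\mathrm{yr}^{-1}), && 3.17 \times 10^{-10}\ \mathrm{Hz} < f < 8.27 \times 10^{-7}\ \mathrm{Hz}, && \text{for SKA-100}
\end{aligned}$$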


are plotted on Fig. 2. The corresponding sensitivity curves in terms of $[S_h(f)]^{1/2}$
and $\Omega_{gw}(f)$ are plotted in Figs. 3 and 4, respectively. We note that the SKA
sensitivity for monochromatic GWs reaches 10~?? in Q,(f) at frequency around 
3.17 x 10-1°Hz. The acronyms for these curves are IPTA-20, FAST-50 and 
SKA-100. 
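The projected PTA sensitivities follow from the stated scaling of $B_{yr}$ (proportional to timing accuracy, inversely proportional to time span and to the number of pulsars) applied to the MTG canonical PTA, and the band edges follow from $[1/T, 1/\Delta t]$. The sketch below reproduces these numbers; the function names are hypothetical, a Julian year of 365.25 days is assumed, and a two-week cadence is assumed for all three arrays.

```python
YEAR_S = 365.25 * 86400.0        # assumed length of a year in seconds
TWO_WEEKS_S = 14 * 86400.0       # assumed observing cadence in seconds
B_CANONICAL = 4e-16              # MTG canonical PTA: 36 pulsars, 100 ns, 5 yr


def projected_b_yr(n_pulsars, timing_ns, span_yr):
    """Scale the canonical B_yr with timing accuracy, time span and pulsar number."""
    return (B_CANONICAL * (timing_ns / 100.0)
            * (5.0 / span_yr) * (36.0 / n_pulsars))


def band_hz(span_yr, cadence_s=TWO_WEEKS_S):
    """Frequency band [1/T, 1/dt] in Hz."""
    return 1.0 / (span_yr * YEAR_S), 1.0 / cadence_s


ipta = projected_b_yr(36, 100.0, 20.0)
fast = projected_b_yr(50, 50.0, 50.0)
ska = projected_b_yr(100, 20.0, 100.0)

ipta_band = band_hz(20.0)
fast_band = band_hz(50.0)
ska_band = band_hz(100.0)
```

The scaling gives $1.44 \times 10^{-17}$ for FAST and $1.44 \times 10^{-18}$ for SKA, which round to the values listed in Table 5.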

As to the single source GW limits. The bounds from PPTA!? and EPTA"!® are 
in the order of 10~'4 for h, in the frequency range 5 x 10~° to 2 x 10~". They are 
drawn on Fig. 2 with the corresponding curves on Figs. 3 and 4. A 24-Hour Global 
Campaign for GW from J1713+0747 gives upper limits in the frequency range 
10-°—10~? Hz; the solid line shows the upper limit in random direction while the 
dotted line show the upper limit in the direction of pulsars.!!! 


3.7. Ultra-low-frequency band (10 fHz-300 pHz) 


GWs with periods longer than the time span of observations produce a simple
pattern of apparent proper motions over the sky.^{112} Therefore, precise measurement
of the proper motions of quasars would be a method to detect ultra-low-frequency
(10 fHz-300 pHz) GWs. Gwinn et al.^{113} used this method to constrain the normalized
spectral energy density of stochastic GWs with frequencies less than $2 \times 10^{-9}$ Hz
and greater than $3 \times 10^{-18}$ Hz (including frequencies in the ultra-low-frequency
band) to less than $0.11\,h^{-2}$ (95% confidence) times the critical closure density
of our universe. In Fig. 4, we use the Planck 2015 value?° of Hubble constant 
$H_0 = (67.8 \pm 0.9)$ km s$^{-1}$ Mpc$^{-1}$ to set $h = 0.678$ in their original plot and obtain a
bound of 0.24 in terms of the critical density (the bound is labelled Quasar Astrometry
(QA) in Fig. 4). A long-baseline optical interferometer with sub-micro-arcsecond
and nano-arcsecond (nas) astrometry is technologically feasible.^{114} With this kind
of interferometer implemented, the precision astrometry of quasar proper motions may
possibly be improved by four orders of magnitude and reach nas yr$^{-1}$. In terms
of energy, the precision of determining/constraining $\Omega_{\mathrm{gw}}(f)$ could reach a
sensitivity of $2.4 \times 10^{-9}$ or better (Fig. 4; the curve is labelled QAG [Quasar Astrometry
Goal]).
Using (60) or Table 2, we have the bound on the characteristic strain $h_c(f)$:

$h_c(f) < 4.2 \times 10^{-19}\,(\mathrm{Hz}/f)$ for $3 \times 10^{-18}$ Hz $< f < 2 \times 10^{-9}$ Hz. (76)

When the angular resolution is improved by four orders of magnitude, the sensitivity
will reach

$h_c(f) = 4.2 \times 10^{-23}\,(\mathrm{Hz}/f)$ for $3 \times 10^{-18}$ Hz $< f < 2 \times 10^{-9}$ Hz. (77)

Both the bound (76) and the curve (77) are plotted on Fig. 2. They are labelled
QA and QAG. Using (60), we also convert $\Omega_{\mathrm{gw}}(f)$ to $[S_h(f)]^{1/2}$ and plot the results
on Fig. 3.
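
As a numerical cross-check of the quasar-astrometry numbers, the conversion from a bound on $\Omega_{\mathrm{gw}}$ to a bound on $h_c(f)$ can be sketched in Python. The sketch assumes the standard relation $\Omega_{\mathrm{gw}}(f) = (2\pi^2/3H_0^2)\,f^2 h_c^2(f)$, which is presumably what (60) expresses; the function names and the megaparsec constant are illustrative.

```python
import math

# Sketch: convert a bound on Omega_gw into a bound on h_c(f) * f.
# Assumption: the standard relation Omega_gw(f) = (2 pi^2 / (3 H0^2)) f^2 h_c(f)^2.

KM_PER_MPC = 3.0857e19  # kilometres in one megaparsec


def hubble_si(h0_km_s_mpc):
    """Hubble constant in s^-1 from its value in km s^-1 Mpc^-1."""
    return h0_km_s_mpc / KM_PER_MPC


def strain_times_frequency(omega_gw, h0_km_s_mpc=67.8):
    """Return h_c(f) * f in Hz for a given Omega_gw (the product is independent of f)."""
    h0 = hubble_si(h0_km_s_mpc)
    return math.sqrt(3.0 * omega_gw / (2.0 * math.pi ** 2)) * h0


if __name__ == "__main__":
    h = 0.678
    omega_qa = 0.11 / h ** 2        # Gwinn et al. bound rescaled to h = 0.678
    # Four orders of magnitude in angle means 1e4 in strain, hence 1e8 in Omega_gw.
    omega_qag = omega_qa * 1.0e-8
    print(f"Omega_gw (QA)  = {omega_qa:.2f}")
    print(f"h_c * f (QA)   = {strain_times_frequency(omega_qa):.2e} Hz")
    print(f"h_c * f (QAG)  = {strain_times_frequency(omega_qag):.2e} Hz")
```

Under these assumptions the product $h_c f$ comes out close to the coefficients of the QA bound and the QAG goal, which suggests that the same relation underlies both.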


3.8. Extremely-low (Hubble)-frequency band (1 aHz—10 fHz) 


The successful prediction by nucleosynthesis of the primordial abundances of $^3$He,
$^4$He, $^7$Li and deuterium puts a constraint of $10^{-5}$ on the integrated tensor perturbations
$\int d(\log f)\,\Omega_{\mathrm{gw}}(f)$.^{116} This is plotted on Fig. 4 as the $\Omega_{\mathrm{gw}}(f) = 10^{-5}$
line. CMB experiments are most sensitive to the extremely-low (Hubble)-frequency
band (1 aHz-10 fHz). First, a strong GW background at extremely low frequency
produces a stochastic redshift of the CMB (Sachs-Wolfe effect; S-W effect).^{116,117} The
COBE observation gives a bound on the CMB S-W redshift fluctuation, which is plotted
on Figs. 2-4 as CMB S-W fluct. The COBE microwave-background quadrupole
anisotropy measurement^{118,119} gives a limit $\Omega_{\mathrm{gw}}(1\,\mathrm{aHz}) \sim 10^{-9}$ on the
extremely-low-frequency GW background.^{120,121} WMAP^{122-124} improves on the COBE
constraints; the constraint on $\Omega_{\mathrm{gw}}$ for the higher-frequency end of this band is better
than $10^{-14}$. The Planck Surveyor space mission has recently probed anisotropies with
$l$ up to 2000 and with higher sensitivity. Ground and balloon experiments probe
smaller-angle anisotropies and, hence, a higher-frequency background. ACTpol has
probed anisotropies with $l$ from 225 up to 8725.^{125} These CMB experiments probe
GWs in the 1 aHz-10 fHz extremely-low (Hubble) frequency band. In inflationary
cosmology these GWs give the tensor-mode density and temperature perturbations
(imprints) on the CMB.

Inflation postulates a rapid accelerated expansion which set the initial moments
of the Big Bang cosmology.^{126-131} The expansion drives the universe towards a
homogeneous and spatially flat geometry that accurately describes the average state of
the universe. The quantum fluctuations in this era grow into the galaxies, clusters
of galaxies and temperature anisotropies of the CMB.^{132-137} Modern inflation
originated from efforts at unification, but its mechanism still remains unclear.
The quantum fluctuations in the spacetime geometry in the inflationary era generate
GWs which would have imprinted tensor perturbations on the CMB anisotropy.
There is no confirmed discovery of these tensor perturbations yet (Refs. 138 and
139 and references therein). The analysis of Planck, SPT and ACT temperature
data together with WMAP polarization did not discover these tensor perturbations
and showed that the scalar index is $n_s = 0.959 \pm 0.007$ and that the tensor-to-scalar
perturbation ratio $r$ is less than 0.11 (95% CL; no running).^{140} The pivot scale of
this constraint is 0.002 Mpc$^{-1}$, corresponding to GW frequency $f \sim 1.5 \times 10^{-18}$ Hz
at present. From Refs. 141 and 117, this constraint corresponds to Qew < ij-* 
and he < 2.34 x 107°; it is plotted on Figs. 2-4 with label Planck. In March 2014, 
the announcement by BICEP2 of the detection of a B-mode polarization excess, and
their interpretation of this excess as an imprint of primordial tensor waves, immediately
attracted the imagination of the scientific community and the public.^{142} The
September 2014 announcement by the Planck team of dust measurements including the
BICEP2 observation region convinced the physics community that the
excess is consistent with dust emission.^{143} The new Keck Array data^{138} confirmed
the BICEP2 B-mode polarization excess. The combined analysis of the BICEP2/Keck
Array and Planck Collaborations^{138} convincingly showed that this excess is consistent
with the Planck dust measurement and that the tensor-to-scalar perturbation
ratio $r$ is constrained to less than 0.12 (95% CL; no running). The pivot scale of
this constraint is 0.05 Mpc$^{-1}$, corresponding to GW frequency $f \sim 3.8 \times 10^{-17}$ Hz
at present. From Refs. 141 and 117, this constraint corresponds to $\Omega_{\mathrm{gw}} < 1.1 \times 10^{-15}$
and $h_c < 5.57 \times 10^{-10}$; it is plotted on Figs. 2-4 with label BICEP2/Keck.

The sources of B-mode polarization in the CMB could be (i) GW imprints
on the CMB; (ii) gravitational lensing during the CMB propagation and (iii)
pseudoscalar-photon interaction during the CMB propagation. In the BICEP2 analysis,^{142}
the gravitational lensing effect is subtracted. The Einstein equivalence principle dictates
that the propagation of electromagnetic waves (photons) obeys the Maxwell equations
locally and that there is no rotation of the polarization plane during propagation [i.e.
no cosmic polarization rotation (CPR)]. However, this is exactly a soft spot in the
empirical foundation of EEP.^{144} For a survey of constraints on CPR from astrophysical
and cosmological observations, see di Serego Alighieri's review.^{145} Basically,
both the mean CPR and the CPR fluctuation magnitude are constrained to a
couple of degrees. For example, from the ACTpol CMB polarization data fitting,^{146}
the mean CPR angle $\alpha$ is constrained from the EB correlation power spectra to be
less than about 1° and the fluctuation (rms) is constrained from the BB correlation
power spectra to $\langle\delta\alpha^2\rangle^{1/2} < 1.68°$. Including the CPR effect together with the Planck
dust measurement in a joint fitting of ACTpol, BICEP2 and POLARBEAR gives
the mean square of the CPR fluctuation $\langle\delta\alpha^2\rangle = 41 \pm 522$ mrad$^2$
and the tensor-to-scalar ratio $r = -0.012 \pm 0.109$; this in turn gives a 1$\sigma$ bound on
the rms of the CPR magnitude, $\langle\delta\alpha^2\rangle^{1/2} < 23.7$ mrad (1.36°), and on $r$, $r < 0.097$.
This result not only gives the best constraint on the CPR fluctuation magnitude,
but is also consistent with the Planck and the joint BICEP2/Keck Array and Planck
bounds.

The ongoing situation, in the view given by Halverson,^{139} is: “The competition
is fierce, with at least six funded ground-based experiments underway (including
the third version of BICEP), several balloon-borne experiments, and a number
of proposed space missions. Finally, thanks to the new data, galactic foreground
contaminants – and strategies for removing them – are now better understood.”
The present consensus is that when the present ground-based and balloon-borne
experiments are completed, the accuracy in determining $r$ will improve by one order of
magnitude to 0.01; when the proposed space missions are flown and
completed, the accuracy will improve by another order of magnitude to
0.001. This means that the sensitivity in the $\Omega_{\mathrm{gw}}$-$f$ plot will be improved to $10^{-17}$.
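
The 1σ bounds quoted earlier in this subsection for the CPR fluctuation and for r appear to follow from adding one standard deviation to the fitted central values. A small Python check of that arithmetic is sketched below; the function name is illustrative, and the input values are those read from the joint-fit results in the text.

```python
import math

# Sketch: one-sided 1-sigma bounds from the joint-fit values quoted in the text.
# Assumption: the bound is taken as (central value + one standard deviation).


def one_sigma_upper(mean, sigma):
    """Upper bound obtained as central value plus one standard deviation."""
    return mean + sigma


if __name__ == "__main__":
    # <delta alpha^2> = 41 +/- 522 mrad^2 and r = -0.012 +/- 0.109
    var_bound = one_sigma_upper(41.0, 522.0)            # mrad^2
    rms_bound_mrad = math.sqrt(var_bound)               # mrad
    rms_bound_deg = math.degrees(rms_bound_mrad * 1.0e-3)
    r_bound = one_sigma_upper(-0.012, 0.109)
    print(f"rms CPR fluctuation < {rms_bound_mrad:.1f} mrad ({rms_bound_deg:.2f} deg)")
    print(f"r < {r_bound:.3f}")
```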


4. Sources of GWs 


In this section, we discuss sources of GWs concisely and refer to various references
for a more complete treatment.


4.1. GWs from compact binaries 


Binary neutron stars coalesce by losing energy of orbital motion through the
emission of GWs. When the orbital radius is much larger than the stellar radius, the
GW radiation is described by the quadrupole approximation reviewed in Sec. 2,
until merging starts and the two stars are deformed by each other’s tidal forces.^{147}
The wave signal chirps as time advances, and its frequency ranges from the low
frequency set by the orbital period up to the characteristic merger frequency
(~ 1 kHz). Since the amplitude of the signal increases until the
merger (inspiral phase), the signal of this inspiral phase is the most probable target
of all ground-based laser interferometers with km-scale baseline length.
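
The chirp described above can be illustrated with the leading-order (quadrupole) result for a circular binary, $f_{\mathrm{gw}}(\tau) = \pi^{-1}\,(5/256\tau)^{3/8}\,(G\mathcal{M}/c^3)^{-5/8}$, where $\tau$ is the time remaining to coalescence and $\mathcal{M}$ is the chirp mass. This is the textbook formula, not necessarily the form used in Sec. 2, and the 1.4 $M_\odot$ component masses below are assumed purely for illustration.

```python
import math

# Sketch: leading-order (quadrupole) chirp of a circular compact binary.
# Assumptions: textbook Newtonian-order formula; component masses of 1.4 solar
# masses are illustrative and not taken from the source. The formula is only an
# approximation and loses accuracy close to the merger.

G_MSUN_OVER_C3 = 4.925491e-6  # G * M_sun / c^3 in seconds


def chirp_mass(m1, m2):
    """Chirp mass in the same units as m1 and m2."""
    return (m1 * m2) ** 0.6 / (m1 + m2) ** 0.2


def gw_frequency(tau_s, m1_msun=1.4, m2_msun=1.4):
    """GW frequency in Hz at time tau_s (seconds) before coalescence."""
    t_chirp = G_MSUN_OVER_C3 * chirp_mass(m1_msun, m2_msun)
    return (5.0 / (256.0 * tau_s)) ** 0.375 * t_chirp ** -0.625 / math.pi


if __name__ == "__main__":
    for tau in (1000.0, 100.0, 10.0, 1.0, 0.01):
        print(f"tau = {tau:8.2f} s  ->  f_gw = {gw_frequency(tau):7.1f} Hz")
```

The frequency rises slowly at first and then sweeps rapidly towards the kHz range in the last fraction of a second, which is the chirp that ground-based interferometers target.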

There are several neutron star binaries in our Galaxy (in the case of J1906+0746
the companion star may be a white dwarf). Table 6 lists all those that may coalesce
through the emission of GWs within the age of the universe. After the merger,
the coalesced neutron stars form a black hole, which oscillates just after the merger
owing to the dynamical energy of the coalescence; this is known as the quasi-normal
mode oscillation of the black hole. Since its typical frequency is several kHz for a black
hole of a few $M_\odot$, the oscillation (ringdown) is a GW target for detectors that are
sensitive at high frequencies, such as the GEO-HF detector or cryogenic mechanical
detectors.


Table 6. Neutron star binaries in our Galaxy that may coalesce within the age
of the universe (the companion star of J1906+0746 may be a white dwarf). $P_s$ is
the pulse period, $P_b$ the orbital period of the binary system, $e$ the eccentricity
of the orbit and $\tau_{\mathrm{life}}$ the lifetime of the binary system.

                    $P_s$ (ms)   $P_b$ (hr)   $e$      $\tau_{\mathrm{life}}$ (Gyr)
B1913+16°           59.03        7.75         0.62     0.37
B1534+12^{148}      37.40        10.10        0.27     2.93
J0737-3039A^{149}   22.70        2.45         0.088    0.23
J1756-2251^{150}    28.46        7.67         0.18     2.03
J1906+0746^{151}    144.14       3.98         0.085    0.082
J2127+11C^{152}     32.76        8.04         0.68     0.32

The coalescence rate of binary neutron stars is estimated from both
the distribution of binaries and the lifetimes of the binary systems. Owing to
small-number statistics and to the observational biases of pulsar surveys (e.g.
dissipation of electromagnetic waves in our Galaxy, beaming angle, faint pulsars,
etc.), the estimated event rate ranges over more than two orders of magnitude, with
a typical value of once per 100,000 years in a mature galaxy such as ours.^{153} Since the
population of such mature galaxies is roughly 0.01 per cubic Mpc, about one event
per year can be detected if ground-based detectors achieve the sensitivity to catch
events occurring at 130 Mpc. Advanced LIGO initiated observation
run 1 (O1) on 18 September 2015 with a sensitivity reaching 70 Mpc for the
coalescence of nominal binary neutron stars.
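
The rate estimate above is a product of three round numbers: the coalescence rate per galaxy, the number density of galaxies and the volume within reach. A minimal Python sketch of that arithmetic follows; it ignores detector antenna patterns and source orientation, and the function and constant names are illustrative.

```python
import math

# Sketch: expected binary-neutron-star event rate within a given range,
# using the round numbers quoted in the text. Uniform galaxy density and a
# sharp distance cut-off are simplifying assumptions.

RATE_PER_GALAXY_PER_YR = 1.0e-5   # one coalescence per 100,000 yr per galaxy
GALAXIES_PER_MPC3 = 0.01          # number density of galaxies like ours


def detection_rate_per_yr(range_mpc):
    """Events per year inside a sphere of radius range_mpc."""
    volume = 4.0 / 3.0 * math.pi * range_mpc ** 3
    return RATE_PER_GALAXY_PER_YR * GALAXIES_PER_MPC3 * volume


if __name__ == "__main__":
    for reach in (70.0, 130.0):
        print(f"range {reach:5.0f} Mpc -> {detection_rate_per_yr(reach):.2f} events/yr")
```

With these inputs a 130 Mpc reach gives roughly one event per year, while a 70 Mpc reach gives a rate several times smaller, because the rate scales with the cube of the range.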

In the coalescence of binary black holes, the chirping signal shifts down to
lower frequencies. If their initial masses are around 10 $M_\odot$, the
merger may occur at around 200 Hz. The signal is then in the most sensitive frequency
range of the second-generation ground-based interferometers.

Dominick estimated that the population and the coalescence rate of binary black
holes are smaller than those of binary neutron stars.^{154} However, a theoretical study
shows that the merger rate of black holes based on ejections from globular clusters
is larger than that of neutron star binaries.^{155} This is still an open issue with
differing opinions. Moreover, since the amplitude of GWs from the coalescence of black
holes is larger, the possible detection rate will be larger if the detector has sensitivity
at lower frequencies ($\lesssim$ 10 Hz), which will be realized by the third-generation detectors.

We plot the source strengths of compact binary inspirals, pulsars, resolvable
galactic binaries and unresolvable galactic binaries [confusion background]*}1°° in
Figs. 2-4 by adopting those of Moore et al.*°


4.2. GWs from supernovae 


Massive stars heavier than 8 $M_\odot$ collapse under gravity after burning out, and a
neutron star may be born. This collapse produces a burst of GWs. Taking the second
derivative of the quadrupole moment of the star and using (35), the maximum
amplitude $h_{\max}$ is estimated to be $\zeta M R^2 (2\pi f)^2$, where $\zeta$ [of the order of $G_N/(c^4 r)$
times the nonsphericity of the explosion] is a calculable numerical factor, $M$ is the mass
of the initial neutron star born just after the collapse, $R$ is the radius of the star
and $r$ is the observation distance to the star. If the collapse occurs in the center
of our Galaxy under favorable conditions, the burst wave signal may be detected by
resonant antennas. It is also a target GW source for ground-based interferometric
detectors.^{157}
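
To give a feeling for the size of this estimate, the expression $h_{\max} \sim \zeta M R^2 (2\pi f)^2$ with $\zeta \sim \epsilon\,G_N/(c^4 r)$ can be evaluated numerically. The Python sketch below is an order-of-magnitude illustration only: the neutron-star mass, radius, burst frequency, distance and the nonsphericity $\epsilon$ are all assumed values that do not come from the text.

```python
import math

# Sketch: order-of-magnitude burst amplitude h_max = zeta * M * R^2 * (2 pi f)^2,
# with zeta = epsilon * G / (c^4 r). All input values below are illustrative
# assumptions, not numbers from the source.

G = 6.674e-11        # m^3 kg^-1 s^-2
C = 2.998e8          # m s^-1
M_SUN = 1.989e30     # kg
KPC = 3.0857e19      # m


def h_max(mass_kg, radius_m, freq_hz, distance_m, nonsphericity):
    """Maximum strain amplitude from the quadrupole estimate."""
    zeta = nonsphericity * G / (C ** 4 * distance_m)
    return zeta * mass_kg * radius_m ** 2 * (2.0 * math.pi * freq_hz) ** 2


if __name__ == "__main__":
    # Assumed: 1.4 M_sun proto-neutron star, 10 km radius, 1 kHz, source at 8 kpc.
    for eps in (1.0e-2, 1.0e-3):
        h = h_max(1.4 * M_SUN, 1.0e4, 1.0e3, 8.0 * KPC, eps)
        print(f"nonsphericity {eps:.0e} -> h_max ~ {h:.1e}")
```

The amplitude is linear in the assumed nonsphericity, so the uncertainty in that single factor carries over directly to the predicted strain.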

The GW waveform information is useful for enhancing the signal-to-noise ratio of the
detector. Since stellar core collapse is a complex physical phenomenon that involves
GR, hydrodynamics and neutrino transport with thermonuclear kinetics over a short
time duration, it is not easy to conduct a full numerical simulation to obtain the GW
waveform. In 2002, Dimmelmeier et al.^{158} first performed axisymmetric hydrodynamic
simulations of rotational core collapse and its associated GW emission in 26
general-relativistic and Newtonian models. The total energy of GWs emitted is only about
10-*—10-§ Moc?. Recent development in three-dimensional numerical simulation 
which requires longer computing time shows that a strong GW burst with a total emitted
energy of 0.01 $M_\odot c^2$ can be produced by an initially nonrotating star owing to the standing
accretion shock instability (SASI).^{159,160} For a review on this subject, see Ref. 161.
These may be plausible candidates for the second-generation ground-based
detectors. The event rate of supernova explosions in our Galaxy is estimated as
once per 40 ± 10 yr.^{162} According to Abadie et al.,^{163} a supernova explosion with
GW energy 0.056 $M_\odot c^2$ could be detected at 16 Mpc with the sensitivity achieved by
LIGO-Virgo; hence a supernova explosion with GW energy ~ 0.01 $M_\odot c^2$ should be
detected up to 6.8 Mpc with the same sensitivity. If a supernova
explosion is always accompanied by the emission of GW energy of ~ 0.01 $M_\odot c^2$,
the detection rate on the Earth would be 0.04 yr$^{-1}$, assuming a uniform distribution of
galaxies such as ours with density 0.01 Mpc$^{-3}$. This rate would nominally be improved to
1.7 yr$^{-1}$ by adLIGO at its present sensitivity (a 3.5-fold improvement compared with
Ref. 163). However, since there is a large uncertainty in the distribution of GW
energy strength in supernova explosions, we just adopt the strength given
by Moore et al.®° for plotting in Figs. 2-4.
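
The two scalings used in this paragraph can be checked with a few lines of Python. The sketch assumes that the strain amplitude scales as the square root of the emitted GW energy divided by distance, and that sources are uniformly distributed so that the rate grows as the cube of the reach; the function names are illustrative.

```python
import math

# Sketch: scale the supernova detection reach with emitted GW energy and the
# detection rate with detector sensitivity, using the numbers quoted in the text.
# Assumptions: strain amplitude ~ sqrt(E_GW) / distance, and uniformly distributed
# sources so that the rate scales with the cube of the reach.


def reach_mpc(e_gw, e_ref=0.056, reach_ref_mpc=16.0):
    """Reach for GW energy e_gw (in M_sun c^2), scaled from a reference case."""
    return reach_ref_mpc * math.sqrt(e_gw / e_ref)


def improved_rate(rate_per_yr, sensitivity_gain):
    """Rate after improving the strain sensitivity by the factor sensitivity_gain."""
    return rate_per_yr * sensitivity_gain ** 3


if __name__ == "__main__":
    print(f"reach for 0.01 M_sun c^2: {reach_mpc(0.01):.1f} Mpc")
    print(f"rate with 3.5-fold better sensitivity: {improved_rate(0.04, 3.5):.1f} per yr")
```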


4.3. GWs from massive black holes and their coevolution 
with galaxies 


Observational evidence indicates that massive black holes (MBHs) reside in most
local galaxies. Relations have been discovered between the MBH mass and the
mass of the host galaxy bulge, and between the MBH mass and the velocity dispersion.
These relations indicate that the central MBHs are linked to the evolution of galactic
structure. A newly fueled quasar may come from the gas-rich major merger of
two massive galaxies. Recent astrophysical evidence links these major
galaxy mergers to the growth of supermassive black holes in quasars.^{164,165}
Distant quasar observations indicate that MBHs of billions of solar masses already
existed less than a billion years after the Big Bang. At present, there are different
theoretical proposals for the initial conditions and formation of black
holes. These scenarios include BH seeds from the inflationary universe and/or from the
collapse of Population III stars, different accretion models and binary formation
rates. All of these models generate MBH merging scenarios in galaxy co-evolution
with GW radiation. Measurement of the amplitude and spectrum of these GWs will
tell us the cosmic history of MBH formation.

The standard theory of MBH formation is the merger-tree theory, with various
MBH binary (MBHB) inspirals taking place. The GWs from these MBHB inspirals
can be detected and explored out to cosmological distances using space GW detectors.
Although there are different merger-tree models and models with BH seeds, they
all give significant detection rates for space GW detectors and PTAs.^{100,166-168}
GW observation in the 300 pHz-0.1 Hz frequency band will be a major observational
tool for studying the coevolution of galaxies with BHs. This frequency band
covers the low-frequency band (100 nHz-100 mHz) and the very-low-frequency band
(300 pHz-100 nHz), and is in the detection range of PTAs, eLISA/LISA-like/Earth-orbiting
missions and ASTROD-GW. PTAs are most sensitive in the frequency
range 300 pHz-100 nHz, eLISA/LISA-like/Earth-orbiting space GW detectors
are most sensitive in the frequency range 2 mHz-0.1 Hz, while ASTROD-GW is
most sensitive in the frequency range 500 nHz-2 mHz (Figs. 2-4).
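
The most-sensitive ranges listed above can be collected in a small lookup table. The Python sketch below simply encodes the quoted ranges and reports which detector classes cover a given frequency; the dictionary layout and the function name are illustrative.

```python
# Sketch: which detector classes are most sensitive at a given GW frequency,
# using the "most sensitive" ranges quoted in the text. The dictionary layout
# and function name are illustrative.

MOST_SENSITIVE_HZ = {
    "PTAs": (300e-12, 100e-9),                       # 300 pHz - 100 nHz
    "ASTROD-GW": (500e-9, 2e-3),                     # 500 nHz - 2 mHz
    "eLISA/LISA-like/Earth-orbiting": (2e-3, 0.1),   # 2 mHz - 0.1 Hz
}


def detectors_for(freq_hz):
    """Return the detector classes whose most-sensitive range contains freq_hz."""
    return [name for name, (lo, hi) in MOST_SENSITIVE_HZ.items() if lo <= freq_hz <= hi]


if __name__ == "__main__":
    for f in (1e-9, 3e-7, 1e-5, 2e-3, 1e-2):
        print(f"{f:.0e} Hz -> {detectors_for(f) or 'gap between most-sensitive ranges'}")
```

A frequency such as 300 nHz falls between the quoted most-sensitive ranges of PTAs and ASTROD-GW; this only reflects the ranges as stated, not the full sensitivity curves of Figs. 2-4.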

PTAs have been collecting data for decades for the detection of the stochastic GW
background from MBH binary mergers. They already constrain $A_{\mathrm{yr}}$ in Eq. (74) to
less than $1.0 \times 10^{-15}$ with 95% confidence.!°?-1%4 This limit excludes present and
most recent model predictions of supermassive black hole formation with 91-99.7%
probability.'°? This means that a detection could come at any time in the near future.
Since we know that SMBHs have already formed, it also means that the backgrounds in the
higher-frequency/shorter-wavelength band are higher than originally predicted. For most
models there is a knee around $f \sim 100$ nHz; here we straighten the knee and extend
Eq. (74) to $f \sim 10$ μHz with a dashed line in Fig. 2. Below this $1.0 \times 10^{-15}$ limit, we
plot a pink-colored region to show the possible background source region. The corresponding
line and colored region are also shown in Figs. 3 and 4.
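
The straightened-knee extension can be illustrated with a simple power law. The Python sketch below assumes that Eq. (74) has the conventional form $h_c(f) = A_{\mathrm{yr}}\,(f/\mathrm{yr}^{-1})^{-2/3}$ for a background of circular, GW-driven MBH binaries; this section does not restate the equation, so the spectral index and the function name are assumptions.

```python
# Sketch: stochastic background from MBH binaries as a power law in frequency.
# Assumption: Eq. (74) has the conventional form h_c(f) = A_yr * (f / yr^-1)^(-2/3);
# the source does not restate it in this section.

SECONDS_PER_YEAR = 365.25 * 86400.0


def h_c_background(f_hz, a_yr=1.0e-15, alpha=-2.0 / 3.0):
    """Characteristic strain of the power-law background at frequency f_hz."""
    return a_yr * (f_hz * SECONDS_PER_YEAR) ** alpha


if __name__ == "__main__":
    for f in (1.0 / SECONDS_PER_YEAR, 1e-7, 1e-5):
        print(f"f = {f:.2e} Hz -> h_c <= {h_c_background(f):.2e}")
```

Evaluating the same power law at 100 nHz and at 10 μHz shows how far the upper edge of the allowed background region drops when the knee is ignored.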

eLISA and ASTROD-GW will be able to observe directly how MBHs form,
grow and interact over the entire history of galaxy formation. ASTROD-GW will
detect the stochastic GW background from MBH binary mergers in the frequency range
500 nHz-100 μHz. These observations are important for the study of the
co-evolution of galaxies with MBHs. The expected rate of MBHB sources is
10 yr$^{-1}$-100 yr$^{-1}$ for eLISA and 10 yr$^{-1}$-1000 yr$^{-1}$ for LISA.®* For ASTROD-GW, a
number of sources similar to that of LISA is expected, with better angular resolution.®* For
a more detailed account, see Ref. 169.

At present, there are different theoretical scenarios for the initial conditions and
formation of black holes, e.g. primordial MBH clouds as seeds, direct formation of
a supermassive black hole via multi-scale gas inflows in galaxy mergers, direct collapse
into a supermassive black hole from mergers between massive protogalaxies with
no need to suppress cooling and star formation, etc. The mass range and maximum
mass of Population III stars are also relevant issues for seed BHs. With the PPTA
constraint, there should be more background in the μHz region. ASTROD-GW,
with good sensitivity in the μHz band, will help to detect or constrain the GW
background and so distinguish the various scenarios for the history of BH and galaxy
coevolution.

With the detection of MBH merger events and background, the properties and
distribution of MBHs could be deduced and the underlying population models could be
tested. Sesana et al.^{170} consider and compare ten specific models of MBH formation.
These models are chosen to probe four important and largely unconstrained aspects
of the input physics used in structure formation simulations, i.e. seed formation,
metallicity feedback, accretion efficiency and accretion geometry. With Bayesian
analyses to recover the posterior probability distribution, they show that LISA has
enormous potential to probe the underlying physics of structure formation. With
better sensitivity in the frequency range 100 nHz-1 mHz, ASTROD-GW will be able
to probe the underlying physics of structure formation further. With the detection
of the GW background of MBH mergers, PTAs and ASTROD-GW will add to
our understanding of structure formation.

We plot the source strengths of massive binaries in Figs. 2-4 adopting those of
Moore et al.2®


4.4. GWs from extreme mass ratio inspirals (EMRIs) 


EMRIs are GW sources for space GW detectors. The eLISA sensitive range for
central MBH masses is $10^4$-$10^7$ $M_\odot$. The expected number of eLISA detections
over two years is 10-20;%4 for LISA, a few tens;°* for ASTROD-GW, a similar or larger
number, with sensitivity toward larger central BHs and with better angular resolution.®*
For a more detailed account, we refer to Ref. 169. We plot the source strengths of
EMRIs in Figs. 2-4 adopting those in Moore et al.°°


4.5. Primordial/inflationary/relic GWs 


Relic GWs from an inflationary or noninflationary period are commonly called
primordial GWs. Relative to primordial GWs, all the GW sources we have discussed are
foregrounds. Assuming that the primordial GW spectrum is flat in the $\Omega_{\mathrm{gw}}(f)$ versus
$f$ diagram, i.e. that the tensor index $n_t$ is 0, we draw an upper bound on the inflationary
spectrum that saturates the constraints given in Sec. 3.8; it is the flat line at about
the $10^{-15}$ level in the $\Omega_{\mathrm{gw}}(f)$ versus $f$ diagram (Fig. 4), with
the very-high-frequency part dropping steeply above $10^{10}$ Hz. For comparison, the
black dotted curve shows the corresponding $\Omega_{\mathrm{gw}}(f)$ for 0.9 K blackbody radiation;
this is the GW background expected if the GW perturbations had been in equilibrium
with the matter fields. We refer the readers to the recent review by Sato and
Yokoyama on “Inflationary cosmology: First 30+ years”^{171} for a detailed account
of the inflationary scenario.

As expected in Sec. 3.8, the present consensus on the CMB B-polarization
measurements is that when the present ground-based and balloon-borne experiments
are completed, the sensitivity in the $\Omega_{\mathrm{gw}}$-$f$ plot will improve by one order of
magnitude to $10^{-16}$, and when the proposed space missions are flown and completed,
the sensitivity will improve by another order of magnitude to $10^{-17}$.

The instrument sensitivity goals of DECIGO,” Big Bang Observer’? and 6-S/C 
ASTROD-GW® all reach the $10^{-17}$ level in terms of $\Omega_{\mathrm{gw}}$ (Fig. 4). The sensitivities
of IPTA, FAST and SKA also reach the $10^{-17}$ level or beyond in terms of $\Omega_{\mathrm{gw}}$
(Fig. 4). These instrument sensitivities are good enough to probe primordial GWs
down to the $10^{-17}$ level or beyond in terms of $\Omega_{\mathrm{gw}}$ at frequencies around 1 nHz,
10-300 μHz and 0.1-1 Hz, to search for and test inflationary/noninflationary physics.
The main issue is the level of the foreground and whether the foreground can be separated.


T-496 K. Kuroda, W.-T. Ni and W.-P. Pan 


4.6. Very-high-frequency and ultra-high-frequency GW sources 


There are four kinds of potential GW sources in the very-high-frequency and
ultra-high-frequency bands^{39}:

(i) Discrete sources.^{172}
(ii) Cosmological sources.^{173}
(iii) Braneworld Kaluza-Klein (KK) mode radiation.^{174,175}
(iv) Plasma instabilities.^{176}


In general, objects do not radiate efficiently at wavelengths very different from
their size. This implies that objects radiating in these bands need to be very small and
yet have a very large energy concentration to induce significantly large curvature
fluctuations. Grishchuk^{173} estimated the GWs generated from the amplification of
quantum fluctuations by inflation. GWs now in these bands would have had such short
wavelengths at the time of generation that new physics might have been at work in that
period. However, the nucleosynthesis bound of $\Omega_{\mathrm{gw}}(f) \sim 10^{-5}$ must be satisfied
by the spectrum of any GW background.”* $h_c$ at 100 MHz, 10 GHz and 1 THz
would need to be less than 9.5 x 107°, 9.5 x 1073! and 9.5 x 107%%, respectively. 
The actual signals may be much lower. Various theoretical models^{177-184} predict
GWs at levels from $\Omega_{\mathrm{gw}}(f) \sim 10^{-8}$ to below $\sim 10^{-18}$. See Ref. 39 and references
therein for more details.

To close this subsection, we quote from Ref. 39: “Even assuming the most opti- 
mistic noise temperatures and the highest magnet strengths, detection of the cosmo- 
logical signals look beyond reasonable extrapolation of current performance whereas 
very-high-frequency GWs from braneworld scenarios may be within range of cur- 
rent technology. The most optimistic plasma instability signals from our galaxy if 
they occur at the low-frequency end of the range could also be above the sensi- 
tivity of future microwave detectors. ... There may also be astrophysical processes 
that convert violent electromagnetic events into very-high-frequency gravitational 
sources that could be detected but more targeted modelling is needed to identify 
candidate astronomical objects. The technology for detectors which convert the GW 
directly to an electromagnetic signal is currently available and builds on decades of 
development for other applications.” 


4.7. Other possible sources 


Cosmic strings are popular GW sources in many theoretical investigations. For
the possible GW magnitudes of the cosmic-string contribution in various bands, please see
Ref. 185 and references therein. Recently, Geng et al.^{186} proposed the coalescence
of strange-quark planets with strange stars as a new kind of GW burst source for
ground-based interferometers. As GW astronomy and GW physics progress, GW sources
of various different origins may be detected. This remains open until the
experiments and observations are performed.
5. Discussion and Outlook


In spite of tremendous efforts in the high-frequency band and some efforts in
very-high-frequency band experiments, GWs have not been directly detected yet. This
is due to the weakness of GWs in the present epoch.

The first generation of km-sized arm-length interferometers reached the sensitivity
needed to detect binary neutron star inspirals up to the distance of the Virgo cluster. From
the statistics of the astrophysical binary neutron star distribution, the rate of detection
is about 0.05 events per year, with a large uncertainty. However, with a ten-fold
increase in strain sensitivity, the reach in distance increases ten-fold and the reach
in astrophysical volume increases a thousand-fold. Hence, the rate of detection
becomes about 50 events per year. This is the goal of Advanced LIGO,*° Advanced Virgo*®
and KAGRA/LCGT°’ under construction. Advanced LIGO has achieved 3.5-fold
better sensitivity, with a reach of 70 Mpc for neutron star binary mergers,
and began its first observing run (O1) on September 18, 2015, searching for GWs.
We could expect the detection of GWs at any time. We will soon see a global network of
second-generation km-size interferometers for GW detection.
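
The scaling argument above can be written out explicitly. Assuming a uniform source density, the detection rate $R$ grows as the cube of the distance reach, which in turn is proportional to the strain-sensitivity gain $g$:

```latex
\begin{align}
  R(g)   &= R_0\, g^{3}, \\
  R(10)  &= 0.05\ \mathrm{yr^{-1}} \times 10^{3} = 50\ \mathrm{yr^{-1}}, \\
  R(3.5) &= 0.05\ \mathrm{yr^{-1}} \times 3.5^{3} \approx 2.1\ \mathrm{yr^{-1}}.
\end{align}
```

The last line is only an illustration of the same scaling for the 3.5-fold gain quoted for O1; it is not a rate stated in the text.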

Another avenue for real-time direct detection is from the PTAs. The PTA bound
on the stochastic GW background already excludes most theoretical models; this may
mean that we could also detect very-low-frequency GWs at any time, although on a longer
time scale.

We have presented a complete frequency classification of GWs according to their
detection methods. Although there is no direct real-time detection of GWs yet,
several bands are amenable to direct detection. Real-time direct detection may first
come in the high-frequency band or in the very-low-frequency band. Although the
launch of a space GW detector is only expected in about 20 years, detection
in the low-frequency band may have the largest signal-to-noise ratios. This will
enable the detailed study of black hole coevolution with galaxies and of the dark
energy issue. Foreground separation and correlation detection methods need to be
investigated to achieve sensitivities of $10^{-16}$-$10^{-17}$ or beyond in $\Omega_{\mathrm{gw}}$, in order to study the
primordial GW background for exploring the very early universe and possibly quantum
gravity regimes.

When we look back at the theoretical and experimental development of GW
physics and astronomy over the last 100 years, there have been many challenges, some
pitfalls and, during the last 50 years, close interactions among theorists and
experimentalists. The subject and the community have become clearly multidisciplinary. One
example is the interaction of the GW community and the quantum optics community
over the last 40 years to identify standard quantum uncertainties in measurement,
to realize that these are not an obstacle to measurement in principle, and to find ways
to overcome them. Another example is the interaction of the physics community and
the astronomy community to understand and identify detectable and potentially
detectable GW sources. With current technology development and astrophysical
understanding, we are in a position to use GWs to study more thoroughly
galaxies, supermassive black holes and clusters together with cosmology, and to
explore deeper into the origin of gravitation and our universe. The next 100 years will
be the golden age of GW astronomy and GW physics. The current and coming
generations hold such promise.


Note added in proof: After this review appeared on arXiv, Refs. 187-189 were
brought to our attention; they show that the pulsar timing method can also detect the
imprint of a stochastic GW background on pulsar timing parameters in the ultralow
frequency range down to $f \sim c/r$, where $r$ is the distance to the pulsar. We thank
Maxim Pshirkov for his helpful communication.
Gravitational waves, which convey information about source objects in the universe,
are predicted by Einstein’s general relativity. The detection of gravitational waves is
a direct test of the theory and will be used as a new tool to investigate the dynamical
nature of the universe. However, the effect of a gravitational wave is too tiny
to be easily detected. From the first attempts utilizing resonant antennas in the 1960s,
efforts to improve antenna sensitivity continued by applying cryogenic techniques
until the quantum limit of sensitivity was approached. However, by the year 2000, resonant
antennas had given way to interferometers. Large projects involving interferometers
started in the 1990s and achieved successful operation by 2010, with an extensive
accumulation of technical inventions and improvements. In this memorial year 2015,
we enter a new phase of gravitational-wave detection with the forthcoming operation of
the second-generation interferometers. The main focus of this paper is on how advanced
techniques have been developed step by step as the arm length of the interferometer
was scaled up; the history of the fight against technical noise, thermal noise and
quantum noise is presented along with the current projects LIGO, Virgo, GEO-HF and
KAGRA.


Keywords: Gravitational wave; detection; ground based. 


PACS Number: 04.80.Nn 


1. Introduction to Ground-Based Gravitational-Wave Detectors 


Gravitational waves are predicted by Einstein’s general relativity and are
produced by the dynamic acceleration of celestial objects. A gravitational wave
propagates at the speed of light, inducing a distortion of the surrounding spacetime. The
detection of gravitational waves is a direct test of the theory of general relativity and
becomes a new tool of astronomy for observing the dynamic nature of the universe.
However, the effect of a gravitational wave is too tiny to be easily detected. The first
experiment was attempted by J. Weber with his resonant-type antennas placed
at Argonne National Laboratory and at the University of Maryland. He declared
that he had succeeded in detecting gravitational waves coming from the center
of our Galaxy.^{1} Although the detection was discredited by several independent
experiments based on a pair of antennas,^{2-6} this opened an era of experimental
astrophysics of gravitational waves. Another type of gravitational-wave detector
on Earth is the laser interferometric detector.

Full expertise in several disciplines of experimental physics is required in order to
realize an advanced interferometer. To give the reader an idea of the depth of
knowledge and the variety of arguments required, the author has tried to include selected
R&D items that were confirmed by experimental facts.

In this paper, the detection of gravitational waves is reviewed for both the resonant
antenna and the laser interferometer, after a brief summary of the current status of
gravitational-wave detection.


1.1. Gravitational-wave sources 


The gravitational-wave sources of interest have evolved along with the advancement of
detector R&D. In the era of resonant antennas, the typical source was a supernova
explosion, and the signal amplitude was roughly estimated from the energy balance
between the released gravitational-wave energy and the total energy of the
gravitational wave propagating in the universe. In the era of laser interferometers,
which have a wider observation band, the main target is the coalescence of compact
binary stars. We are now at the stage where the second-generation detectors begin
operation and designs are being made for third-generation interferometers, whose
sensitivity is better by one order of magnitude than that of the second-generation
ones and which can explore far more abundant gravitational-wave sources, as
summarized in Ref. 7.

The currently achieved sensitivities of large gravitational-wave detection projects
are introduced, and gravitational-wave sources are presented, in this subsection.


1.1.1. Achieved sensitivities of large projects 


The first-generation laser interferometers were realized around the world as LIGO Hanford
(1) and (2), LIGO Livingston,8 Virgo, GEO,10 and TAMA.11 They achieved their
design sensitivities by the mid-2000s. The sensitivity depends on the baseline length:
the highest sensitivity was attained by the LIGO 4 km interferometers, for which the most
remote observable neutron star binary is located 50 Mpc away. The operation of these
first-generation detectors ended with a cooperative observation in 2010.12 No detection
of gravitational waves has been reported from this observation. Considering the
event rate of neutron star binaries, these detectors would have to run for more than
100 years on average to observe one event. If we assume the coalescence of black hole
binaries with masses of 10 M⊙, we can limit their birth rate down to 6 x 10~°/year by
this observation.
The second-generation detectors are designed to achieve higher sensitivity and
are under construction. They are Advanced LIGO (Hanford and Livingston),13
Advanced Virgo,14 GEO HF,15 and KAGRA.16 Construction of the LIGO-India interferometer
may begin soon. Once the first detection of a gravitational wave is achieved,
gravitational-wave astronomy will start. In that phase, the frequency band has to be
made wider in order to detect more types of gravitational-wave sources. In particular,
increased sensitivity at lower frequencies will be achieved by a cryogenic
interferometer placed underground. This kind of detector is called a
third-generation interferometer. KAGRA is positioned between the second and third
generations owing to its adoption of cryogenic mirrors and an underground location.

European scientists plan to construct a more sensitive gravitational-wave detector,
the Einstein Telescope, the design study of which was finished in 2011.17

The achieved sensitivities of the laser interferometers used for observation are plotted
as broken curves, and the sensitivities of the second- and third-generation
detectors are shown as solid curves, in Fig. 1. Note that the curves show the value
of the noise spectrum multiplied by the square root of the frequency, which makes it easy to
compare them with the characteristic amplitudes of the gravitational-wave signals expected
from various sources, except for continuous gravitational waves.
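
The scaling applied to the curves can be sketched in a few lines. The snippet below is only an illustration, assuming the noise spectrum is given as an amplitude spectral density in units of 1/sqrt(Hz); the function name and the sample value are hypothetical and are not taken from any detector data.

```python
import math

def characteristic_noise_amplitude(frequency_hz, asd_per_rthz):
    """Return sqrt(f) times the noise amplitude spectral density.

    frequency_hz  -- frequency in Hz
    asd_per_rthz  -- noise amplitude spectral density in 1/sqrt(Hz)
                     (assumed input unit)
    The result is dimensionless and can be compared with the
    characteristic amplitude of a burst or chirp signal.
    """
    return math.sqrt(frequency_hz) * asd_per_rthz

# Illustrative (made-up) noise level: 1e-23 /sqrt(Hz) at 100 Hz.
h_n = characteristic_noise_amplitude(100.0, 1e-23)
print(f"sqrt(f) * noise = {h_n:.2e}")
```

Because the scaled curve is dimensionless, a source whose characteristic amplitude lies above it at some frequency is, roughly speaking, a candidate for detection in that band.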


Fig. 1. The achieved sensitivities of laser interferometers operated for observation are plotted
as broken curves, and the sensitivities of the second- and third-generation detectors are shown
as solid curves (KAGRA: Broadband RSE, aLIGO: ZERO_DET_high_P, aVirgo: Wideband SR
tuning). Typical gravitational-wave sources are also shown in the figure.

In this figure, examples of gravitational-wave sources that can possibly be
detected by the ground-based interferometers are shown. They are described as
follows.


1.1.2. Coalescences of binary neutron stars 


There are several neutron star binaries in our Galaxy, and some of them will coalesce
within the age of the universe owing to the emission of gravitational waves.18 During
the last few minutes before coalescence, a chirping gravitational wave is emitted, and
this signal is the most probable target of ground-based laser interferometers. The
radiation of gravitational waves by the orbital motion of a binary pulsar is described
by the quadrupole approximation until the merger starts, where the two stars may be
deformed by each other's tidal forces.19 The orbital radius at which this happens is
difficult to know, because the merging radius of neutron stars is unknown: the equation
of state of neutron matter is not well known at the extremely high densities of
neutron stars. Therefore, the analytical calculation is carried out until the orbital
radius approaches 6 times the neutron-star radius (inspiral phase). The radiation
during the close approach and the merger (merger phase) can be handled only by
numerical calculation.20


$$h(f) = \sqrt{\frac{5}{24\pi^{4/3}}}\,\frac{Q}{D}\,c\left(\frac{G M_c}{c^3}\right)^{5/6} f^{-7/6}, \tag{1}$$

where $Q$ is a factor representing the dependence on both the inclination angle to the
source position and the inclination angle to the direction of the orbital plane of
the binary stars, $D$ is the distance from the source to the Earth, and $M_c$ is the
chirp mass, calculated as $\mu^{3/5} M^{2/5}$, where $\mu$ is the reduced mass of the
binary, $m_1 m_2 / M$, and $M$ is the total mass of the binary, $M = m_1 + m_2$. $G$ is
the Newtonian gravitational constant, and $c$ is the speed of light. The case
$D = 100$ Mpc, $Q = 1$ and $m_1 = m_2 = 1.4\,M_\odot$, where $M_\odot$ is the mass of
the sun, is plotted in Fig. 1.
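
As a small worked example of the mass combination entering the inspiral amplitude, the sketch below evaluates the chirp mass for the equal-mass case plotted in Fig. 1. It assumes the standard definition, reduced mass to the power 3/5 times total mass to the power 2/5; the function name is hypothetical.

```python
def chirp_mass(m1, m2):
    """Chirp mass mu^(3/5) * M^(2/5), in the same unit as m1 and m2.

    Assumes the standard definition with reduced mass mu = m1*m2/M
    and total mass M = m1 + m2.
    """
    total = m1 + m2
    reduced = m1 * m2 / total
    return reduced ** 0.6 * total ** 0.4

# Equal-mass neutron star binary, masses in solar masses.
mc = chirp_mass(1.4, 1.4)
print(f"chirp mass = {mc:.3f} solar masses")
```

For equal masses the chirp mass reduces to the component mass divided by the fifth root of 2, so a 1.4 + 1.4 solar-mass binary has a chirp mass of roughly 1.22 solar masses.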


The estimated event rate ranges over more than two orders of magnitude, and a typical
value is once per 100 thousand years in a mature galaxy such as ours.21,22


1.1.3. Coalescences of binary black holes 


If two black holes orbit each other as a binary, the system radiates
gravitational waves according to the quadrupole formula as long as the orbital radius
is considerably larger than the event-horizon radius of each black hole. As in the case
of binary neutron stars, the system gradually loses its dynamical energy through
the radiation of gravitational waves, and its orbital radius decreases until the merger.
The coalescence of binary black holes with spin has been analyzed
numerically.23 Since the population of binary black holes is estimated to be
smaller than that of neutron star binaries, the coalescence rate
of black hole binaries is smaller than that of binary neutron stars.24 However, a
theoretical study shows that the merger rate of black holes ejected from globular clusters
is larger than that of neutron star binaries.25 Moreover, since the amplitude of
gravitational waves from the coalescence of black holes is larger, the possible detection
rate will be larger if the detector is sensitive at frequencies below 10 Hz,
which will be realized by the third-generation detectors.


1.1.4. Supernova explosion 


Massive stars heavier than 8 M⊙ collapse under gravity after burning out, and a
neutron star may be born. This collapse produces a burst of gravitational waves. Since
a supernova explosion is a complex physical phenomenon involving general relativity,
hydrodynamics at nuclear density, neutrino transport, and thermonuclear kinetics,
there is no widely accepted scenario. The magnitude of the gravitational
wave from stellar core collapse was estimated to order of magnitude from the second
time derivative of the quadrupole moment of the core. However, the waveform of the
gravitational wave is useful for enhancing the signal-to-noise ratio of the detection.
The first waveform catalogue was calculated by numerical simulation for rapidly rotating
stars, taking general relativistic effects in the collapse into account.26 Recent
developments in three-dimensional numerical simulation, which requires more computing
power, show that a burst stronger by one order of magnitude than that in the first
catalogue is produced by nonaxisymmetric dynamical instabilities such as a rapidly
spinning bar-like core27 and the standing accretion shock instability. In any case, the
expected maximum amplitude $h_{max}$28 is calculated by assuming that the magnitude of
the second derivative of the quadrupole moment is $\kappa M R^2 (2\pi f)^2$, where
$\kappa$ is 0.1 for the case of interest, $M$ is the mass of the neutron star born just
after the collapse, 1.4 $M_\odot$, and $R$ is the radius of the star, 20 km. The source
distance from the Earth is taken as 20 Mpc. In addition, a factor of 0.2 is assumed for
the departure from the symmetric figure of a sphere, represented by $\theta_I$, which is
0 for a spherically symmetric collapse.

$$h_{max} \sim 1\times10^{-23}\left(\frac{20\,\mathrm{Mpc}}{D}\right)\left(\frac{\kappa}{0.1}\right)\left(\frac{\theta_I}{0.2}\right)\left(\frac{M}{1.4\,M_\odot}\right)\left(\frac{R}{20\,\mathrm{km}}\right)^2\left(\frac{f}{1\,\mathrm{kHz}}\right)^2. \tag{2}$$

The numerical value is plotted in Fig. 1. 
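
The order of magnitude of this estimate can be checked with a short calculation. The sketch below is not the author's derivation: it assumes the simplest quadrupole scaling, amplitude = G/(c^4 D) times the second derivative of the quadrupole moment, with that derivative taken as kappa times an asymmetry factor times M R^2 (2 pi f)^2. The 1 kHz reference frequency, the rounded constants, and all names are assumptions made for illustration.

```python
import math

# Rounded physical constants (SI units).
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m/s
M_SUN = 1.989e30     # solar mass, kg
MPC = 3.086e22       # one megaparsec, m

def burst_amplitude(kappa=0.1, asymmetry=0.2, mass_msun=1.4,
                    radius_km=20.0, freq_hz=1000.0, dist_mpc=20.0):
    """Order-of-magnitude strain from a collapsing core.

    Assumes h ~ G / (c^4 D) * kappa * asymmetry * M * R^2 * (2 pi f)^2.
    The prefactor and the 1 kHz default are assumptions of this sketch.
    """
    mass = mass_msun * M_SUN
    radius = radius_km * 1.0e3
    q_ddot = kappa * asymmetry * mass * radius**2 * (2.0 * math.pi * freq_hz)**2
    return G * q_ddot / (C**4 * dist_mpc * MPC)

print(f"h_max ~ {burst_amplitude():.1e}")
```

With the default values the result is of order 10^-23, and it scales inversely with distance and quadratically with both the stellar radius and the frequency.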

This may be a plausible candidate source for the second-generation ground-based
detectors. The event rate of supernova explosions is estimated as once per a few
tens of years in our Galaxy.


1.1.5. Quasi-normal mode oscillation at the birth of black hole 


After the merger of a compact binary, the newborn black hole vibrates, and its vibration
decays through the release of energy as gravitational waves; this is called the ring-down
phase. The expected maximum amplitude in this phase, assuming an energy release of
ΔE = εM⊙c² with ε = 10~® during a time scale of tg = 0.2 ms at a distance of D = 20 Mpc,
is given in Ref. 29.
The observation of this oscillation also gives us an understanding of the Kerr spacetime
geometry, if the initial state of the binary has a large angular momentum.


1.1.6. Unstable fast rotating neutron star 


The angular momentum of the initial massive star is taken over by the newborn neutron
star, which is possibly a rapidly rotating spheroid with differential rotation
between its inner and outer parts. Such a system is unstable and effectively radiates
gravitational waves, which damp the rotation. The emission of gravitational
wave is described by the following formula® that is plotted in Fig. 1: 


1 1 

R [GM 91 (20Mpe\ ( R M \?( f \2 

oe wi x16 
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(4) 


More detailed descriptions of the inspiral, merger and ring-down of a coalescing
compact binary system, and references for other gravitational-wave sources,
are found in the relevant chapters of this publication series.


1.2. Acceleration due to a gravitational wave 


The detection principle of a gravitational wave is based on the excitation caused by the
tidal force of the wave, which is represented by a metric perturbation. In
resonant antennae, the resonant vibration modes of an elastic body are excited,
whereas in a laser interferometer, the geodesic motions of freely suspended test masses
deviate from each other.30

The metric perturbation by a gravitational wave propagating along the z-direction
is described in the TT gauge by

$$ds^2 = -c^2 dt^2 + (1 + h^{TT}_{xx})\,dx^2 + 2h^{TT}_{xy}\,dx\,dy + (1 + h^{TT}_{yy})\,dy^2 + dz^2, \tag{5}$$

where

$$h^{TT}_{xx} = -h^{TT}_{yy} = A_+(\omega t - kz), \tag{6}$$

$$h^{TT}_{xy} = h^{TT}_{yx} = A_\times(\omega t - kz), \tag{7}$$

are used, and $\omega$ is the angular frequency and $k$ is the wave number of the wave. By
using these metric perturbations, we can calculate the components of the Riemann
tensor as

$$R_{x0x0} = -R_{y0y0} = -\frac{1}{2c^2}\,\ddot{A}_+(\omega t - kz), \tag{8}$$

$$R_{x0y0} = R_{y0x0} = -\frac{1}{2c^2}\,\ddot{A}_\times(\omega t - kz), \tag{9}$$

where double dots stand for the second derivative with respect to $t$. In the detector
coordinate system, which is the proper reference frame of the experimental room,
the wave induces the acceleration of a unit mass relative to the center of mass of
the detector as

$$\frac{d^2x}{dt^2} = -c^2\left(R_{x0x0}\,x + R_{x0y0}\,y\right) = \frac{1}{2}\left(\ddot{A}_+\,x + \ddot{A}_\times\,y\right), \tag{10}$$

$$\frac{d^2y}{dt^2} = -c^2\left(R_{y0y0}\,y + R_{y0x0}\,x\right) = \frac{1}{2}\left(-\ddot{A}_+\,y + \ddot{A}_\times\,x\right), \tag{11}$$

$$\frac{d^2z}{dt^2} = 0. \tag{12}$$


The energy carried by the wave can be represented by the energy-momentum tensor


(A? + A2). (13) 


The force field produced by a gravitational wave is sketched by the force lines 
as in Fig. 2. 
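
The force field can also be tabulated numerically. The sketch below assumes the detector-frame acceleration of a unit mass in the form a_x = (A''_+ x + A''_x y)/2 and a_y = (-A''_+ y + A''_x x)/2, where A'' denotes the second time derivative of each polarization amplitude; the function names are hypothetical. It also checks that the field of the cross polarization is the field of the plus polarization rotated by 45 degrees about the z-axis.

```python
import math

def tidal_acceleration(x, y, a_plus_ddot, a_cross_ddot):
    """Acceleration (ax, ay) of a unit mass at (x, y) in the detector frame.

    a_plus_ddot, a_cross_ddot -- second time derivatives of the two
    polarization amplitudes at the instant considered (units 1/s^2).
    """
    ax = 0.5 * (a_plus_ddot * x + a_cross_ddot * y)
    ay = 0.5 * (-a_plus_ddot * y + a_cross_ddot * x)
    return ax, ay

def rotate(vx, vy, angle):
    """Rotate the vector (vx, vy) counter-clockwise by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return c * vx - s * vy, s * vx + c * vy

def rotated_plus_field(x, y, amp_ddot, angle=math.pi / 4.0):
    """Plus-polarization field rotated as a whole by `angle` about z."""
    xr, yr = rotate(x, y, -angle)                  # point in the unrotated frame
    ax, ay = tidal_acceleration(xr, yr, amp_ddot, 0.0)
    return rotate(ax, ay, angle)                   # rotate the vector back

point = (0.3, -1.2)
print(tidal_acceleration(*point, 0.0, 1.0))        # pure cross polarization
print(rotated_plus_field(*point, 1.0))             # plus field rotated by 45 deg
```

The two printed vectors agree to rounding error, which is the statement that the two polarizations differ only by a 45 degree rotation of the force pattern.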

The force caused by the acceleration is normal to the wave propagation direction.
Let us assume for simplicity that there are two point masses. If these masses are
bound by some elastic body, the acceleration between the two masses causes stress
inside this body. However, if the two masses are free, the acceleration changes the
distance between them. In contrast to this picture, which uses the proper reference
frame of the detector, the acceleration in the TT gauge is given to first order by

$$\left(\frac{d^2x^j}{dt^2}\right)_{TT} = -c^2\,\Gamma^j_{00} = 0, \tag{14}$$

where $\Gamma$ represents the Christoffel symbol. The coordinate value of the mass does
not change in the TT gauge even if a gravitational wave passes. Indeed, the proper
length in the TT gauge, which is converted from the baseline length between two
point masses in the experimental room, does not change due to a gravitational wave
in the first-order approximation of the metric perturbation.

Fig. 2. A gravitational wave propagating along the z-axis causes acceleration of a mass
relative to the center of mass of the detector. The picture shows the force field of the
“+” polarization. The force field of the “×” polarization is obtained by rotating this
force field by 45 degrees around the z-axis.
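
To give a feeling for the size of the effect on free masses in the detector frame, the following sketch integrates the tidal acceleration of a mass at distance L from the reference point for a sinusoidal plus-polarized wave. It assumes an acceleration of half the second time derivative of the wave amplitude times L; the strain amplitude, baseline, and frequency are illustrative values, and the function name is hypothetical.

```python
import math

def peak_separation_change(h0, baseline_m, freq_hz, steps=2000):
    """Integrate d2(delta)/dt2 = 0.5 * h''(t) * L over one wave period.

    The wave is h(t) = h0 * cos(omega t); the mass starts at the crest,
    delta(0) = 0.5 * h0 * L with zero velocity. Returns the largest
    |delta| reached, using velocity-Verlet time stepping.
    """
    omega = 2.0 * math.pi * freq_hz
    dt = 1.0 / (freq_hz * steps)

    def acceleration(t):
        return -0.5 * h0 * omega**2 * math.cos(omega * t) * baseline_m

    delta = 0.5 * h0 * baseline_m
    velocity = 0.0
    t = 0.0
    acc = acceleration(t)
    peak = abs(delta)
    for _ in range(steps):
        delta += velocity * dt + 0.5 * acc * dt * dt
        acc_new = acceleration(t + dt)
        velocity += 0.5 * (acc + acc_new) * dt
        acc = acc_new
        t += dt
        peak = max(peak, abs(delta))
    return peak

# Illustrative values: strain 1e-21, 4 km baseline, 100 Hz wave.
print(f"peak displacement = {peak_separation_change(1e-21, 4000.0, 100.0):.2e} m")
```

The integration reproduces the simple rule that the separation changes by half the strain times the baseline, about 2 x 10^-18 m for these values.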


1.3. Response of a resonant antenna 


A resonant antenna is an elastic body whose mechanical resonance is excited by
the stress due to gravitational waves. The response of the detector depends on the
shape, rigidity and material density of the antenna.31 The antenna
developed by J. Weber was a cylindrical bar.32 Cryogenic bar antennae were developed
by various research groups around the world. An analytical calculation for the
bar-type antenna was given in the form of the absorption cross-section by Paik and
Wagoner.33 Here, the author introduces an analysis by Hirakawa et al.34 The strain
of the antenna body is described by a displacement vector field, $u_\alpha$, where
$\alpha$ specifies one of the components in three-dimensional coordinates. If a
gravitational wave impinges on the antenna, the equation of motion of the antenna is

$$\rho\,\ddot{u}_\alpha - \mu\,\Delta u_\alpha - (\lambda + \mu)(\nabla\cdot\mathbf{u})_{,\alpha} = \frac{\rho}{2}\sum_\beta \ddot{h}_{\alpha\beta}\,x_\beta, \tag{15}$$

where $\rho$ is the density, and $\mu$ and $\lambda$ are the Lamé elasticity coefficients
of the antenna material. We assume that the material is isotropic. In this case,
Young's modulus is given by $\mu(3\lambda + 2\mu)/(\lambda + \mu)$ and the Poisson ratio by
$\lambda/[2(\lambda + \mu)]$. According to the above equation, since $\mathbf{u}(x, y, z, t)$
can be expanded as $\sum_n \xi_n(t)\,\mathbf{w}_n(x, y, z)$, the amplitude, $\xi_n$, of the
nth mode satisfies the following equation, considering the internal mechanical loss, $1/Q_n$:


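$$\ddot{\xi}_n + \frac{\omega_n}{Q_n}\,\dot{\xi}_n + \omega_n^2\,\xi_n = \frac{\sum_{\alpha\beta}\ddot{h}_{\alpha\beta}\,q_{n\alpha\beta}}{4\int\rho\,|\mathbf{w}_n|^2\,dV}, \tag{16}$$

where

$$q_{n\alpha\beta} = \int\rho\left(w_{n\alpha}x_\beta + w_{n\beta}x_\alpha - \frac{2}{3}\,\delta_{\alpha\beta}\,w_{n\gamma}x_\gamma\right)dV, \tag{17}$$

and $\omega_n$ is the angular frequency of the nth eigenmode. The equilibrium energy of the
antenna at the nth resonance is given by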


#2 (Gun) 
E= (18) 


This is the saturated vibration energy due to a monochromatic continuous gravitational
wave. For burst wave signals, the energy deposited in the antenna is effectively
described by the antenna cross-section and the antenna directivity pattern.35 When
a burst of unpolarized waves propagating in the direction $\mathbf{n} = (n_x, n_y, n_z)$
impinges on the antenna, the deposited energy is

1 
/ (lit? + u2|ul2)aV, 


Eaeposited = 2 


3 


= TO Mv? AGO (ne, My: Re) FY), (19) 


where $M$ is the mass of the antenna, $\nu$ is the frequency of the gravitational wave,
$F(\nu)$ is the energy spectrum density of the burst wave, and $A_G$ is the antenna
cross-section, given by
2 
2 » Tov 


Ag= : (20) 
M [ plwPav 
Also, $\Theta$ is the antenna directivity pattern, calculated by
1 ( 2 1 2 
ri dapnans) ee a3 = daBdayNgny 
O(nz, Ny, Nz) = 4 » 2 yy yy : (21) 


5a 
5 ow 108 
Narihara35 gave numerical values for a square antenna, as sketched in
Fig. 3, where the mass is M = 1400 kg, the length is equal to the width, 1.65 m, and
the thickness is 0.14 m. $A_G$ is 0.77 m² for an isotropic source


Eaeooitea = 3:6 B10" Fm). (22) 
The directivity pattern of the antenna is calculated in polar coordinates 
5 65 5 
00,4) =5— 5sin’O + ; sin* 0 cos? ¢. (23) 


The response of a disk antenna was similarly given by Paik.36
In the late 1990s, the merit of pursuing a spherical antenna was emphasized and
practically analyzed by Johnson and Merkowitz,37 although the spherical antenna's
enhanced gravitational cross-section, together with its ability to measure the signal
direction and polarization, had been recognized in the early 1970s.38 Whereas a bar
antenna can detect only one quadrupole mode, a sphere has five degenerate ones that
interact strongly with a gravitational wave. Each mode acts as a separate antenna,
oriented toward a different polarization or direction. This merit had already been
mentioned by Wagoner and Paik,39 who found that the cross-section improves by
about a factor of 60 compared with a bar of the same quadrupole mode frequency
and a typical length/diameter ratio of 4.2.

Fig. 3. A deformation image of the square-type antenna that was developed by Hirakawa et al.

Prototypes of this sphere detector were made at both LSU and Leiden University
(MiniGRAIL, as shown in Fig. 4),40 and a spherical detector, Schenberg, was developed
in Brazil.41 All of the above resonant detectors were expected to catch burst waves.

Fig. 4. Prototype sphere detector, MiniGRAIL, at Leiden University.
Source: Photo is taken from http://www.minigrail.nl/.


1.4. Response of an interferometer 


It is easier to work with the metric perturbation when considering how a laser
interferometer responds to a gravitational wave. The basic interferometric detector
consists of three main optical components that form the so-called Michelson
interferometer, as shown in Fig. 5.

The beam splitter splits the light beam into the two “arm” directions along the x-axis
and y-axis. The reflected beams are split again at the beam splitter, and the beams
combined toward the output photo-detector interfere, reflecting the
phase shift due to the path difference. For simplicity, a wave of “+” polarization is
considered here. While the gravitational wave passes, the metric perturbation
changes the speed of light travelling from the beam splitter to the mirror
and back. Since light propagation is represented by $ds^2 = 0$,
the velocity, $c_x$, along the x-axis is calculated from Eq. (5) as

$$c_x = c\,(1 + h^{TT}_{xx})^{-1/2}. \tag{24}$$

In a similar manner, the velocity, $c_y$, along the y-axis is given by

$$c_y = c\,(1 + h^{TT}_{yy})^{-1/2}. \tag{25}$$

Fig. 5. The Michelson interferometer consists of three main components: a beam splitter,
a mirror on the x-axis and a mirror on the y-axis. The gravitational wave comes from the
zenith with “+” polarization aligned with the x- and y-axis directions. In the TT-gauge
description, the apparent speeds of light in the interferometer arms change alternately
as the wave passes, which creates a phase shift between the beams recombined at the beam
splitter.

If the arm lengths (the distances between the beam splitter and the x-axis mirror
or the y-axis mirror) are the same in the proper reference frame of the Michelson
interferometer, the coordinate positions in the TT gauge corresponding to these
positions do not change due to a gravitational wave. This means that the velocity
difference produces a phase shift, $\Delta\phi_{GR}$, in the interfered light at
the beam splitter, which is easily obtained as

$$\Delta\phi_{GR} = \frac{4\pi}{\lambda}\,L_0\,h, \tag{26}$$

where $\lambda$ is the wavelength of the light, $L_0$ is the arm length, and $h$ is the
amplitude of the gravitational wave ($h = h^{TT}_{xx} = -h^{TT}_{yy}$). This phase shift
is detected by a photo detector catching the output beam from the beam splitter.
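
The size of this phase shift can be illustrated numerically. The sketch below assumes the low-frequency relation, phase shift = 4 pi L0 h / wavelength; the 1064 nm wavelength and the strain of 10^-21 are illustrative assumptions, the 4 km arm is the LIGO baseline mentioned earlier, and the function name is hypothetical.

```python
import math

def michelson_phase_shift(h, arm_length_m, wavelength_m):
    """Phase shift (radians) of a simple Michelson interferometer.

    Assumes the low-frequency limit, 4 * pi * L0 * h / lambda, valid when
    the wave changes slowly compared with the light travel time.
    """
    return 4.0 * math.pi * arm_length_m * h / wavelength_m

# Illustrative numbers: h = 1e-21, 4 km arms, 1064 nm laser light.
dphi = michelson_phase_shift(1e-21, 4000.0, 1.064e-6)
print(f"phase shift = {dphi:.3e} rad")
```

A phase shift of a few times 10^-11 rad for a single round trip indicates why long arms and the noise-reduction techniques discussed in this paper are needed.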

The response to a gravitational wave was first evaluated by Russian
scientists42 and was experimentally tested by Forward.43 Initially, there was a
debate about how interferometers respond to gravitational waves.44

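The above signal output of the interferometer is valid only when the gravitational
wave varies slowly compared with the travel time of the light inside the
arms. In order to handle the frequency spectrum of $h$ correctly, we assume that
$h(t) = \int h(\omega)\,e^{i\omega t}\,d\omega$. Using Eq. (26),

$$\Delta\phi_{GR}(t) = \Omega\int_{t-2\ell/c}^{t} h(t')\,dt' \tag{27}$$

$$= \Omega\int_{t-2\ell/c}^{t}\int h(\omega)\,e^{i\omega t'}\,d\omega\,dt' \tag{28}$$

$$= \Omega\int h(\omega)\,e^{i\omega t}\,\frac{1 - e^{-2i\omega\ell/c}}{i\omega}\,d\omega, \tag{29}$$

where $\Omega$ is the angular frequency of the light and $\ell$ is the arm length,
which is represented by

$$\Delta\phi_{GR}(t) = \int h(\omega)\,e^{i\omega t}\,H_M(\omega)\,d\omega. \tag{30}$$

The response function becomes

$$H_M(\omega) = \frac{2\Omega}{\omega}\,\sin\left(\frac{\ell\omega}{c}\right)e^{-i\omega\ell/c}, \tag{31}$$

which takes its maximum value for $\ell\omega/c = \pi/2$. For a gravitational wave of
1 kHz, the optimum is obtained for $\ell$ = 75 km.*? The frequency response of a simple
Michelson interferometer is shown for two different baseline lengths in Fig. 6.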
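
The frequency response and the optimum arm length can be evaluated directly. The sketch below assumes the response function in the form (2 Omega / omega) sin(l omega / c) exp(-i omega l / c), with Omega the angular frequency of the light and l the arm length; the 1064 nm laser wavelength and the function names are assumptions made for illustration.

```python
import cmath
import math

C = 299_792_458.0  # speed of light, m/s

def michelson_response(gw_freq_hz, arm_length_m, laser_wavelength_m=1.064e-6):
    """Complex response of a simple Michelson interferometer.

    Assumed form: (2*Omega/omega) * sin(l*omega/c) * exp(-i*omega*l/c),
    where Omega is the light angular frequency and l the arm length.
    """
    omega = 2.0 * math.pi * gw_freq_hz
    big_omega = 2.0 * math.pi * C / laser_wavelength_m
    x = omega * arm_length_m / C
    return (2.0 * big_omega / omega) * math.sin(x) * cmath.exp(-1j * x)

def optimal_arm_length(gw_freq_hz):
    """Arm length for which l*omega/c = pi/2, i.e. l = c / (4 f)."""
    return C / (4.0 * gw_freq_hz)

l_opt = optimal_arm_length(1000.0)
print(f"optimum arm length at 1 kHz: {l_opt / 1000.0:.1f} km")
print(f"|H_M| at 1 kHz, 4 km arm:  {abs(michelson_response(1000.0, 4000.0)):.3e}")
print(f"|H_M| at 1 kHz, optimum:   {abs(michelson_response(1000.0, l_opt)):.3e}")
```

In the low-frequency limit the magnitude of this response tends to 4 pi l / wavelength, which is the coefficient of the simple phase-shift formula, and the optimum length at 1 kHz comes out at about 75 km.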


1.4.1. Directivity 


In the previous section, we obtained the response of a Michelson interferometer to
an incoming gravitational wave travelling as shown in Fig. 5, the direction that
gives the maximum signal. In general, the incoming gravitational wave is not
necessarily aligned with the optimum direction of the interferometer.

Fig. 6. Frequency response of a simple Michelson interferometer.

The directional sensitivity of the interferometer is obtained as

$$h_{11}\left[\frac{1}{2}(1 + \cos^2\theta)\cos 2\phi\,\cos 2\psi - \cos\theta\,\sin 2\phi\,\sin 2\psi\right] - h_{12}\left[\frac{1}{2}(1 + \cos^2\theta)\cos 2\phi\,\sin 2\psi + \cos\theta\,\sin 2\phi\,\cos 2\psi\right], \tag{32}$$

where $\theta$ and $\phi$ are the direction angles toward the source of the gravitational
wave and $\psi$ represents the polarization angle of the gravitational wave. $h_{ij}$ is
defined by the following equations:

$$h_{\mu\nu} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & h_{11} & h_{12} & 0 \\ 0 & h_{21} & h_{22} & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & h_{11} & h_{12} & 0 \\ 0 & h_{12} & -h_{11} & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}. \tag{33}$$

If gravitational-wave events occur uniformly in the universe, and there is no bias in
the orbital direction, we can calculate the average detection rate of an interferometer
located on the Earth using the above result. Since the squared average over both
$\theta$ and $\phi$ becomes 2/5, the average in total is 1/5, considering the squared
average over the polarization (1/2). In addition, if the gravitational-wave sources are
binary coalescences, we have to consider the direction of the binary orbit, which is
calculated in practice by allocating random orbit directions in a computer simulation.
Considering all the above, the sensitivity becomes 0.44 times the optimal one.
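
The angular averages can be checked with a simple numerical integration. The sketch below assumes the standard form of the two bracketed angular factors of Eq. (32) and averages the square of the first one over the sky (uniform in cos(theta) and phi) and over the polarization angle; the function names and grid sizes are hypothetical choices.

```python
import math

def antenna_patterns(theta, phi, psi):
    """Angular factors multiplying the two polarization components.

    Assumes the standard form for an interferometer with arms along
    the x- and y-axes; theta, phi give the source direction and psi
    the polarization angle (all in radians).
    """
    a = 0.5 * (1.0 + math.cos(theta) ** 2) * math.cos(2.0 * phi)
    b = math.cos(theta) * math.sin(2.0 * phi)
    f_plus = a * math.cos(2.0 * psi) - b * math.sin(2.0 * psi)
    f_cross = a * math.sin(2.0 * psi) + b * math.cos(2.0 * psi)
    return f_plus, f_cross

def sky_averaged_power(n_theta=100, n_phi=16, n_psi=16):
    """Midpoint-rule average of f_plus**2 over direction and polarization."""
    total = 0.0
    for i in range(n_theta):
        cos_theta = -1.0 + (2 * i + 1) / n_theta      # uniform in cos(theta)
        theta = math.acos(cos_theta)
        for j in range(n_phi):
            phi = 2.0 * math.pi * (j + 0.5) / n_phi
            for k in range(n_psi):
                psi = math.pi * (k + 0.5) / n_psi
                f_plus, _ = antenna_patterns(theta, phi, psi)
                total += f_plus * f_plus
    return total / (n_theta * n_phi * n_psi)

print(f"optimal direction: {antenna_patterns(0.0, 0.0, 0.0)}")
print(f"averaged square of the first factor: {sky_averaged_power():.4f}")
```

The averaged square comes out close to 0.2. The further reduction due to the random orientation of binary orbits is not included in this sketch.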
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1.4.2. Positioning 


Since gravitational waves pass through the Earth, the detectors sense waves from both
directions, from the sky and from underground. The line of wave propagation is determined
from the exact arrival times at three detectors, and the direction along it is fixed
by a fourth detector. However, since we can utilize information from the wave
polarization, we need only three detectors in practice.

Three interferometers on Earth are assumed to be directed along $n_1$, $n_2$ and $n_3$.
Let us take a coordinate system $x, y, z$, where a gravitational wave passes the Earth
along the z-axis. $n_i$ coincides with the z'-axis of a coordinate system whose x'-axis
and y'-axis lie along the two arms of detector $i$. Consider the coordinate
transformation $L$ that rotates this coordinate system about the z'-axis by $\phi_i$ in
order to make the x'-axis parallel to the xy-plane, then rotates it about the rotated
x'-axis by $\theta_i$ in order to bring the z'-axis into accord with the z-axis, and
finally rotates it about the z-axis by $\psi_i$ in order to bring the x'-axis into accord
with the x-axis. $L$ is the inverse of the transformation from the vector basis of the
detector coordinates to the vector basis of the wave, in which the wave propagation
direction coincides with the z-axis. In short, for the space components,

$$h^{detector}_{\alpha\beta} = h(L e_\alpha, L e_\beta) = (L^{t}\,h\,L)_{\alpha\beta}, \tag{34}$$

where $L$ is the transformation tensor of the vector basis of the coordinates. The output
signal of the interferometer, $S_i(t)$, is proportional to $h^{detector}_{11} - h^{detector}_{22}$:

$$S_i(t) \propto \left[\frac{1 + \cos^2\theta_i}{2}\cos 2\phi_i\,\cos 2\psi_i - \cos\theta_i\,\sin 2\phi_i\,\sin 2\psi_i\right] h_{11}\!\left(t - \frac{R\cos\theta_i}{c}\right) - \left[\frac{1 + \cos^2\theta_i}{2}\cos 2\phi_i\,\sin 2\psi_i + \cos\theta_i\,\sin 2\phi_i\,\cos 2\psi_i\right] h_{12}\!\left(t - \frac{R\cos\theta_i}{c}\right). \tag{35}$$

Let us assume here that all detectors are correctly calibrated, and that the terms
$h_{11}$ and $h_{12}$ are the same for all of them, except that their phase is delayed.
In the above, the gravitational wave was assumed to propagate along the z-axis direction.
However, if the true direction differs by some amount, the directions of all detectors
differ by the same amount. We take this discrepancy as an evaluation parameter. For a
given value of this discrepancy, the coefficients of both $h_{11}$ and $h_{12}$ are fixed
in the above equations for $S_1$ and $S_2$, and we can solve those equations, analytically
or by fitting, for $h_{11}$ and $h_{12}$. The solved values are then substituted into the
third equation, for $S_3$. If the assumed direction differs from the true one, the third
equation may not hold. The direction for which the difference becomes minimum is the most
probable direction of the gravitational wave. In order to obtain the result efficiently,
we evaluate the equation

$$\Delta = (S_1 - [*])^2 + (S_2 - [*])^2 + (S_3 - [*])^2, \tag{36}$$

where $[*]$ denotes the iterated values corresponding to $S_i$. For example, we can apply
a matched filter technique in this analysis. Each interferometer has directions to
which it is not sensitive, so if the source direction is close to one of these, the
accuracy of the determination may not be good. In this case, one more detector is
desired to augment the observation network.
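
The arrival-time information used above can be illustrated with a short calculation of the delay term R cos(theta_i) / c. The sketch assumes that R is the radius of a spherical Earth and that theta_i is the angle between the zenith of detector i and the z-axis; both the interpretation and the names are assumptions of this illustration.

```python
import math

C = 299_792_458.0    # speed of light, m/s
R_EARTH = 6.371e6    # mean Earth radius, m (assumed meaning of R)

def arrival_delay(theta_i, radius_m=R_EARTH):
    """Delay term R*cos(theta_i)/c in seconds, relative to the Earth's centre."""
    return radius_m * math.cos(theta_i) / C

def delay_between(theta_a, theta_b, radius_m=R_EARTH):
    """Difference of the delay terms of two detectors, in seconds."""
    return arrival_delay(theta_a, radius_m) - arrival_delay(theta_b, radius_m)

# Two detectors on opposite sides of the Earth along the propagation axis.
print(f"largest possible delay: {delay_between(0.0, math.pi) * 1e3:.1f} ms")
```

The largest possible difference is the light travel time across the Earth's diameter, about 42.5 ms, which sets the timing accuracy needed for the arrival times to constrain the direction.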


1.5. Comparison of a resonant antenna and an interferometer 


The signal bandwidth of resonant antennas is at most a few tens of Hz,
whereas even first-generation interferometric detectors have a bandwidth of several
hundred Hz, which is wide enough to cover the whole final phase of the
coalescence of binary neutron stars (from several tens of Hz to several hundred Hz).
Therefore, resonant antennae are applicable only to observing burst waves. Before the
1980s, however, only a few researchers believed that laser interferometric
gravitational-wave detectors could become the dominant tool on the Earth; in the
author's view, this was because a large frequency bandwidth requires an anti-vibration
system with excellent performance. A thermal-noise-limited resonant antenna was realized
at cryogenic temperature in 1980, but interferometers approached thermal-noise-limited
sensitivity much later. In 2005, when LIGO reached its design sensitivity, the thermal
noise of the main optics was regarded as limiting the sensitivity at mid-frequencies
in combination with other noise sources.45 The researchers developing resonant
antennas reached a deadlock after achieving thermal-noise-limited sensitivity, and had
to devise both quantum-noise-evading theory46 and technology.47,48 In general, the
effort spent on developing detector reliability was significantly greater for
cryogenic resonant antennae than for interferometric detectors until the construction
of LIGO started.

A comparison of a resonant antenna and an interferometer from the point of view of
energy amplification helps us to grasp the physical advantage of the interferometer.49


2. Resonant Antennae 


The announcement by Weber was disproved by several groups, as stated in the
previous section. All of these antennae were operated at room temperature, and almost all
detectors adopted a piezo-electric transducer. Based on these experiments, Giffard
studied the ultimate sensitivity limit of a resonant gravitational-wave antenna using
a linear-motion detector50; that is, the minimum effective temperature, which expresses
the sensitivity, is twice the noise temperature of the amplifier. Back-action-evading
and quantum nondemolition schemes were introduced later. Giffard's paper
accelerated R&D on cryogenic resonant antennae, called second-generation
antennae. Note that researchers developing resonant antennae
used the minimum effective temperature to present the sensitivity, which was quoted
at a signal-to-noise ratio of unity. This custom was widely used during the
development of second-generation detectors. Technically, in order
to attain the minimum effective temperature of the antenna, impedance matching
is needed between the mechanical resonant body and the
transducer system. Hence, reducing the effective temperature requires both cryogenics
and impedance matching, which leads to the multi-mode resonant antenna.

The development of these second-generation antennae had been conducted since
the 1970s, and was almost complete by the 1990s, when large-scale laser interferometers
were being planned. In the attempt to lower the effective
temperature and enhance the sensitivity, resonant detectors were cooled down
to liquid helium temperature (4.2 K), and some of them even below the superfluid
transition (2.17 K).51


2.1. Development of resonant antennae 


The cryogenic bar antenna system is sketched in Fig. 7. 

The first two-mode cryogenic bar detector was constructed and operated at liquid-helium
temperature at Stanford University in 1977. The vibration of the
680 kg aluminum bar was amplified by a small mass supported by a niobium
diaphragm, whose movement was sensed by flat superconducting pick-up
coils facing its two sides and converted into a magnetic field detected by a
magnetometer using an rf superconducting quantum interference device (SQUID).52 The
average vibrational energy in the lowest longitudinal mode at 1315 Hz was consistent
with the level of thermal noise at the antenna temperature. This became the
basis of a 4800 kg cryogenic antenna,53 which started operation in 1980. Its resonant
frequency was designed to be 830 Hz. This detector was damaged by an earthquake in 1989
and was shut down.
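
The thermal-noise level referred to above can be estimated from equipartition, where the mean potential energy of the mode equals half of the Boltzmann constant times the temperature. The sketch below is an illustration only: it assumes an effective mass of half the bar mass for the lowest longitudinal mode and a temperature of 4.2 K, neither of which is stated for this detector in the text, and the function name is hypothetical.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K

def thermal_rms_displacement(temperature_k, effective_mass_kg, freq_hz):
    """RMS thermal displacement of a mode from equipartition.

    Uses 0.5 * m_eff * omega^2 * <x^2> = 0.5 * k_B * T.
    """
    omega = 2.0 * math.pi * freq_hz
    return math.sqrt(K_B * temperature_k / (effective_mass_kg * omega ** 2))

# Assumed values: 4.2 K, effective mass 340 kg (half of the 680 kg bar), 1315 Hz.
x_rms = thermal_rms_displacement(4.2, 340.0, 1315.0)
print(f"thermal rms displacement ~ {x_rms:.1e} m")
```

Under these assumptions the thermal motion of the mode is about 5 x 10^-17 m, which indicates the displacement resolution the transducer chain had to reach.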

A torsional antenna was developed in Tokyo for low-frequency gravitational
waves.54 The high mechanical Q of aluminum alloy (5056) at cryogenic temperature was
originally found by this research group.55 A series of CRAB experiments was
conducted at KEK (High Energy Accelerator Research Organization) in Tsukuba, Japan, to
observe continuous waves at around 60 Hz from the Crab pulsar. In the end,
a 1.2-ton torsional antenna was developed.56 Since its resonant frequency is much
lower than that of any other resonant antenna developed to detect burst waves
from supernovae, the vibration isolation system was required to be more sophisticated.

Fig. 7. Sketch representing key features of a resonant antenna system: the bar is suspended
through the node of the lowest longitudinal oscillation mode, and a small mass, whose
resonant frequency is set equal to the bar frequency, is connected to the end of the bar.
The electromechanical transducer converts the displacement of the small mass to an electric
signal. The whole system is in a vacuum chamber, and is cooled down to cryogenic temperature.

Explorer was developed by the Rome group and was first operated in 1986 (Fig. 8).
The antenna was made of a high-Q alloy, Al 5056, with a mass of M = 2270 kg and a
length of 3 m. Its fundamental mode frequency was around 900 Hz. It was equipped
with a resonant capacitive transducer followed by a dc-SQUID amplifier. It was
operated at T ≈ 2.6 K in a cryostat cooled with superfluid helium, which removed
the acoustic noise due to the boiling of liquid helium. After a good duty cycle
had been achieved, it was used for observations for more than 10 years from 1990.°”

ALLEGRO was a cryogenic bar detector at Louisiana State University.°> The bar
was a cylinder of aluminum alloy 5056, 60 cm in diameter and 300 cm in length.
Its mass was 2296 kg and its lowest longitudinal normal mode was at 913 Hz. The
longitudinal vibration was amplified by a smaller mushroom-type resonator attached
to one end of the cylinder, whose displacement was sensed by an inductor carrying
a persistent current of 10 A. The resulting current signal was amplified by a
dc-SQUID. It was operational from 1991 until 1995.

NIOBE achieved successful operation in 1995 at the University of Western
Australia in Perth (Fig. 9).°° It operated at about 5 K and consisted of a 1500 kg
Nb bar with a fundamental resonant frequency of 710 Hz. A bending flap weighing
450 g, attached to the end of the bar, amplified the vibration of the bar. The
resonant frequency of the flap was 700 Hz, and the coupled frequencies were 713 Hz
and 694 Hz. The vibrational state was monitored by a superconducting re-entrant
microwave cavity whose capacitance was modulated by the relative motion of the bar
and the bending flap. This parametric transducer scheme thus differed from the
linear schemes adopted elsewhere. Possible parametric instability was suppressed
by a cold-damping technique that controlled the carrier power.

Fig. 8. Explorer resonant antenna in Rome. (For color version, see page I-CP11.)
Source: Photo is reprinted from http://www.roma1.infn.it/rog/explorer/.

Fig. 9. NIOBE resonant antenna at Perth in Western Australia. The Nb antenna body
is housed in the innermost radiation shield.
Source: Photo courtesy of David Blair.

NAUTILUS was designed to achieve mK operation and was built at the Frascati
INFN Laboratories (Fig. 10). With this resonant antenna of the Rome group, cooling
below 0.1 K was achieved for the first time.°! In the second observation run, from
December 1995 to December 1996, continuous observation was conducted at T = 0.1 K
except for maintenance breaks (an 85% thermal duty cycle was achieved).

Fig. 10. NAUTILUS cryogenic resonant antenna at Frascati, Rome.
Source: Photo is reprinted from a presentation file with the permission of
E. Coccia.

AURIGA was built as a twin of NAUTILUS and was first operated at a cryogenic
temperature of several hundred mK from 1995 to 1996. The antenna was located in
Legnaro, Italy (Fig. 11); its body was made of aluminum alloy (5056) and its mass
was 2300 kg. It was equipped with a capacitive transducer coupled to an internal
SQUID amplifier. The lowest temperature, 140 mK, was achieved by a 3He-4He
dilution refrigerator. In the second run, starting in 1997, the operating
temperature of the bar and transducer was lowered to about 90 mK, and kept at
about 200 mK.®!


2.2. Dynamical model of a resonant antenna with two modes 


In order to achieve the ultimate sensitivity of a resonant antenna, impedance
matching can be realized by adding a small-mass resonant system to the antenna
bar. The small-mass system amplifies the displacement of the bar. The dynamical
system of the two-mode antenna is shown schematically in Fig. 12.

Fig. 11. AURIGA cryogenic resonant antenna, the twin of NAUTILUS, which is located
at Legnaro in Padova. (For color version, see page I-CP11.)
Source: Photo is reprinted from http://www.auriga.lnl.infn.it.

Fig. 12. Dynamical model of a resonant antenna with a resonant transducer. Almost
all resonant antennae had this kind of two-mode system, which made it possible to
realize impedance matching between the antenna bar and the transducer. The
dynamical response of the antenna has two resonant modes, one at an upper and one
at a lower frequency. For example, ALLEGRO had 896.8 Hz and 920.3 Hz, and Explorer
had 904.7 Hz and 921.3 Hz.

The equations of motion of this model are

$$M_1\ddot{x}_1(t) + c_1\dot{x}_1(t) + k_1 x_1(t) - c_2\dot{x}_2(t) - k_2 x_2(t) = f_1(t) - f_2(t) + f_T(t) + \tfrac{1}{2} M_1 L_e \ddot{h}(t), \tag{37}$$

$$M_2\ddot{x}_2(t) + M_2\ddot{x}_1(t) + c_2\dot{x}_2(t) + k_2 x_2(t) = f_2(t) - f_T(t). \tag{38}$$

Here, $M_1$ is the mass of the bar and $M_2$ is the mass of the mushroom-shaped
resonator. $L_e$ is the effective length of the bar, and $h(t)$ is the
gravitational-wave strain. $k_1$ and $k_2$ are the spring constants of the bar and
the mushroom resonator, respectively, and $c_1$ and $c_2$ are the damping
coefficients of those parts. $f_1$ and $f_2$ are the Langevin force-noise
generators associated with the dissipation coefficients of each mass. $f_T$ is the
reaction force from the transducer and amplifier, due to the fluctuating magnetic
pressure in the superconducting pick-up coil. The antenna of Stanford University
also adopted a similar system, in which the change of the magnetic field was
sensed directly by a coil coupled to an ac-SQUID that sensed the current change.
Explorer, in contrast, used a capacitance change to extract the current signal
with a dc-SQUID. In either case, a back-action force exists and dictates the
performance of the transducer. The output of the transducer is represented by

$$V_{\rm out}(t) = G x_2(t) + n(t), \tag{39}$$

where $G$ is the gain factor and $n(t)$ is the white noise from the amplifier.

The dynamical response of the antenna has two resonant modes, one at an upper
and one at a lower frequency. For example, ALLEGRO had 896.8 Hz and 920.3 Hz,
and Explorer had 904.7 Hz and 921.3 Hz.
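
As a rough illustration of how the two resonant modes arise, the sketch below
solves the undamped, force-free limit of the two-mode equations of motion in
closed form. It assumes, beyond what the text states, that the bar and the small
resonator are tuned to the same frequency f0 (k1/M1 = k2/M2) and that x2 is
measured relative to the bar end; the function names are illustrative only. Under
these assumptions the two mode frequencies straddle f0 and are separated by
f0*sqrt(mu), where mu = M2/M1.

```python
import math


def two_mode_frequencies(f0_hz, mass_ratio):
    """Normal-mode frequencies (Hz) of the idealized two-mode antenna.

    Assumptions (not from the source text): no damping, no driving forces,
    and both oscillators tuned to f0, i.e. k1/M1 = k2/M2 = (2*pi*f0)**2.
    mass_ratio is mu = M2/M1.
    """
    mu = mass_ratio
    # With lam = (f/f0)**2 the equations of motion reduce to
    #   lam**2 - (2 + mu)*lam + 1 = 0.
    disc = math.sqrt(mu * mu + 4.0 * mu)
    lam_minus = 0.5 * (2.0 + mu - disc)
    lam_plus = 0.5 * (2.0 + mu + disc)
    return f0_hz * math.sqrt(lam_minus), f0_hz * math.sqrt(lam_plus)


def mass_ratio_from_modes(f_minus_hz, f_plus_hz):
    """Invert the idealized model: return (f0, mu) for a measured mode pair."""
    # The product of the roots is 1, so f0 is the geometric mean of the modes.
    f0 = math.sqrt(f_minus_hz * f_plus_hz)
    # In this model the splitting equals f0*sqrt(mu) exactly.
    mu = ((f_plus_hz - f_minus_hz) / f0) ** 2
    return f0, mu


# Mode pair quoted in the text for ALLEGRO.
f0_allegro, mu_allegro = mass_ratio_from_modes(896.8, 920.3)
```

Applied to the ALLEGRO mode pair quoted above, the inversion gives the mass ratio
that this idealized model would need in order to reproduce the splitting; it
should not be read as the mass of the actual transducer.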


2.3. Signal-to-noise ratio and noise temperature 


The signal-to-noise ratio (SNR) of a resonant antenna is optimized for burst-wave
detection. As long as the signal consists of sufficiently short pulses, the
detection procedure does not depend on the exact pulse shape. The purpose of the
filtering applied to the detector output is not to reproduce the signal waveform,
but to decide reliably whether a signal is present or absent, and to determine the
strength and arrival time of the signal. This differs markedly from the objective
of the data analysis of interferometric detectors.

The output voltage of the transducer is sent to a single lock-in amplifier, which
demodulates the signal and acts as a low-pass filter. The reference frequency is
set halfway between the normal-mode frequencies of the antenna, which converts the
normal-mode frequencies to low frequencies. The demodulated signal is represented
in a two-dimensional phase diagram with components $x(t)$ and $y(t)$. The state of
the signal in terms of energy, $E(t)$, is given by

$$E(t) = x^2(t) + y^2(t). \tag{40}$$
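
The demodulation step can be sketched numerically. The following is a minimal
software lock-in, not the processing chain of any particular antenna: it
multiplies a sampled signal by cosine and sine references and averages over the
record, which plays the role of the low-pass filter. The function names and the
simple boxcar average are assumptions made for illustration.

```python
import math


def lock_in(samples, sample_rate_hz, ref_hz):
    """Return the two quadratures (x, y) of `samples` at the reference frequency.

    The average over the whole record acts as a crude low-pass filter; the
    factor 2 restores the amplitude of a sinusoid at the reference frequency.
    """
    n = len(samples)
    x_sum = 0.0
    y_sum = 0.0
    for k, value in enumerate(samples):
        phase = 2.0 * math.pi * ref_hz * k / sample_rate_hz
        x_sum += value * math.cos(phase)
        y_sum += value * math.sin(phase)
    return 2.0 * x_sum / n, 2.0 * y_sum / n


def energy(x, y):
    """Energy-like quantity E = x**2 + y**2 of the phase-diagram state."""
    return x * x + y * y
```

For a sinusoid of amplitude A at the reference frequency, sampled over a whole
number of cycles, the returned E equals A squared regardless of the phase of the
signal, which is why the state is conveniently tracked in the (x, y) plane.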


In the absence of any transducer noise, the quantity $E(t)$ is proportional to
the vibrational energy in the antenna mode, whose probability distribution is
proportional to $\exp(-E/k_BT)$, where $k_B$ is Boltzmann's constant. At room
temperature, the thermal vibration amplitude is dominant even if the transducer
noise is included. If the antenna is cooled down to a cryogenic temperature, the
noise of the transducer is no longer negligible. For short pulses of gravitational
waves, the change of the signal power, $\Delta E(t)$, is a useful index to measure
the sensitivity of the antenna:

$$\Delta E(t) = [x(t) - x(t-\tau)]^2 + [y(t) - y(t-\tau)]^2, \tag{41}$$

where $\tau$ is the time window, chosen to cover the bandwidth of the target pulse
while limiting the noise of the transducer. There is an optimum value of $\tau$,
which is determined by a balance between the Brownian noise of the bar and the
electrical noise of the transducer. A practical $\tau$ ranges from 0.3 s to 1.0 s.
The average $\langle\Delta E\rangle$ represents an impulse energy, and its mean
value corresponds to a temperature that is determined by both the Nyquist noise of
the antenna and the transducer noise, and that is much smaller than room
temperature. This is because the fluctuating excursion of a state in the phase
diagram is less than

$$\langle \Delta x^2(\tau) \rangle \approx \frac{k_B T}{M\omega^2}\,\frac{\tau}{\tau_a}, \tag{42}$$

which is smaller by the extra factor defined by the last fraction,
$\tau/\tau_a$, where $\tau_a$ is the relaxation time of the antenna mode vibration.
This means that the fluctuation amplitude can be reduced by the low mechanical
loss (high Q) of the antenna material. The reduction factor is affected by the
effective noise temperature. The observation of mechanical Nyquist noise in the
cryogenic bar antenna was clearly shown in Ref. 62. In order to achieve the
optimum sensitivity of a given resonant antenna system, impedance matching of the
antenna with the transducer is needed.

It is important to know how reliably the kick amplitude produced by an incoming
gravitational wave exceeds the thermal excursion during the time window. The
lower the excursion amplitude, the more reliably such kicks can be distinguished
from thermal fluctuations.
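
To give a feeling for the sizes involved, the sketch below evaluates the full
thermal rms displacement, sqrt(k_B T / (M omega^2)), and the reduced excursion
over a time window tau, taking the excursion variance to be
(k_B T / (M omega^2)) * (tau / tau_a). Several inputs are assumptions rather than
statements of the source: the relaxation time is taken as tau_a = Q / (pi f0), the
full bar mass is used in place of an effective mode mass, and the ALLEGRO-like
numbers (4.2 K, 2296 kg, 913 Hz, Q = 1.5e6, tau = 0.5 s) are used only as an
example.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def thermal_rms_displacement(temp_k, mass_kg, freq_hz):
    """Full thermal rms displacement sqrt(k_B T / (M omega^2)) in metres."""
    omega = 2.0 * math.pi * freq_hz
    return math.sqrt(K_B * temp_k / (mass_kg * omega ** 2))


def excursion_rms(temp_k, mass_kg, freq_hz, tau_s, q_factor):
    """Rms excursion during a window tau, reduced by sqrt(tau / tau_a).

    Assumption: tau_a = Q / (pi * f0), the amplitude relaxation time.
    """
    tau_a = q_factor / (math.pi * freq_hz)
    full = thermal_rms_displacement(temp_k, mass_kg, freq_hz)
    return full * math.sqrt(tau_s / tau_a)


full_rms = thermal_rms_displacement(4.2, 2296.0, 913.0)
window_rms = excursion_rms(4.2, 2296.0, 913.0, 0.5, 1.5e6)
reduction = window_rms / full_rms
```

With a relaxation time of several hundred seconds and a window of less than one
second, the excursion relevant for burst detection is a few percent of the full
thermal amplitude, which is the sense in which a high-Q bar behaves as if it were
much colder than its thermodynamic temperature.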


2.4. Comparison of five resonant antennae 


Since the first cryogenic operation in 1980, five resonant antennae were operated
until the early 2000s. Every antenna achieved its own world record during its
operation. The performance is described mainly by the strain noise spectral
density, the duty cycle, the antenna pattern, and a nonstationary “background”
noise in excess of the model. The strain noise spectral density, $S_h(f_0)$, is
given by

$$S_h(f_0) = \frac{\pi}{2}\,\frac{k_B T}{M Q v_s^2 f_0}, \tag{43}$$

where $v_s$ is the elastic sound velocity of the bar material. The frequency
bandwidth of the detector is given by the full width at half maximum of the
resonance,

$$\delta f = \frac{f_0}{Q}\,\Gamma^{-1/2}, \tag{44}$$

where $\Gamma$ is

$$\Gamma \approx \frac{\text{wide-band noise in resonance}}{\text{narrow-band noise}}. \tag{45}$$

$\Gamma$ decreases if the noise temperature decreases and if the efficiency of the
transducer becomes large. Using the above strain noise power spectrum, the minimum
detectable gravitational-wave amplitude (SNR = 1) for short bursts is described by


where $\tau_g$ is the time duration of the gravitational wave.

The five resonant antennae are summarized in Table 1.6 When the interferometric
gravitational-wave detectors began operation, all of the above-mentioned resonant
antennae had been shut down, except for NAUTILUS and AURIGA, which are both in
operation (astro-watch^a); their operation is expected to stop during the current
year.

Table 1. Summary of resonant antennae.

                                      ALLEGRO      EXPLORER     NIOBE        NAUTILUS     AURIGA
Bar working temp. [K]                 4.2          2.6          5            0.13         0.25
Mechanical Q factor                   1.5 x 10^6   2 x 10^6     20 x 10^6    0.5 x 10^6   3 x 10^6
Strain noise S_h^(1/2) [Hz^(-1/2)]    10 x 10^-22  6 x 10^-22   8 x 10^-22   2 x 10^-22   2 x 10^-22
Eff. bandwidth [Hz]                   1            0.2          1            0.6          0.5
Eff. noise temp. [mK]                 10           10           3            2            2
Burst strain sens. h_min              8 x 10^-19   8 x 10^-19   10 x 10^-19  4 x 10^-19   4 x 10^-19
Duty cycle                            97%          75%          75%          75%          75%
SNR > 5 rate [day^-1]                 100          150          75           150          200

Note: Since the first cryogenic operation in 1980, five resonant antennae were
operated until the early 2000s with intermittent observations. NAUTILUS and AURIGA
are being operated in 2015 to cover the lack of observations while the
interferometers are under construction.
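
As a plausibility check on the strain-noise entries of Table 1, the sketch below
evaluates the spectral density in the form S_h(f0) = (pi/2) k_B T / (M Q v_s^2 f0).
Two inputs are assumptions: the sound velocity is estimated from the bar length as
v_s = 2 L f0 (a free bar resonating at half a wavelength), and the Q value of
1.5e6 is the ALLEGRO entry as read from the table.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def strain_noise_amplitude(temp_k, mass_kg, q_factor, length_m, f0_hz):
    """Square root of S_h(f0) in 1/sqrt(Hz) for a bar antenna on resonance.

    Assumption: v_s = 2 * L * f0, i.e. the fundamental longitudinal mode of a
    free bar of length L.
    """
    v_s = 2.0 * length_m * f0_hz
    s_h = (math.pi / 2.0) * K_B * temp_k / (mass_kg * q_factor * v_s ** 2 * f0_hz)
    return math.sqrt(s_h)


# ALLEGRO-like parameters taken from the text and Table 1.
allegro = strain_noise_amplitude(4.2, 2296.0, 1.5e6, 3.0, 913.0)
# Explorer-like parameters taken from the text and Table 1.
explorer = strain_noise_amplitude(2.6, 2270.0, 2.0e6, 3.0, 900.0)
```

Under these assumptions the results come out close to 1e-21 per root hertz for
the ALLEGRO parameters and somewhat lower for Explorer, in line with the order of
magnitude listed in the table; the agreement is only indicative, since the real
detectors also include transducer and amplifier noise.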


3. Interferometers 


The development after Forward was initiated by several prominent researchers:
J. Hough at Glasgow University, R. Weiss at the Massachusetts Institute of
Technology (MIT), Schilling et al. at the Max Planck Institute in Garching, and
R. Drever, who moved from Glasgow University to the California Institute of
Technology (Caltech).

The development of interferometer techniques can be divided into three stages in
time. The first stage was the era of prototype interferometers, led by the
Garching 3 m and 30 m delay-line interferometers.6*® The Glasgow 10 m®® and
Caltech 40 m®’ interferometers also belong to this category. The ISAS^b 10 m®* and
100 m® interferometers inherited the technique developed by the Garching 30 m one.
At this stage, the basic concept of the interferometer was formulated so as to
remove technical noise sources such as laser amplitude and frequency noise, noise
from the mirror suspension subsystem, and scattered-light noise. The knowledge and
experience accumulated through these R&D efforts were used to design the next
stage, the first-generation large-scale interferometers, with sensitivity limited
only by the fundamental noises. Under the condition that technical noises are well
suppressed, the noise of the first-generation interferometers consists of
photon-counting noise, thermal noise originating from the mirrors, including the
suspension system, and seismic noise at low frequencies.2:9:70:7! At the
beginning of the installation of the first-generation detectors, there was no
standard method to estimate the SNR for compact binary coalescence. During the
operation of the first-generation detectors, almost all fundamental noise limits
were reached, except for a few newly found, initially unidentified noise sources,
such as up-conversion noise and electrostatic charge noise, which affected the
sensitivity in the low-frequency region (around 40 Hz).”? In this second stage,
optical-coating thermal noise came into focus and vacuum squeezing injection was
tested. At the time of outlining this paper, the construction and installation of
the second-generation detectors are ongoing. This is the third stage, in which new
optical configurations are considered and techniques to reduce the mechanical loss
of the optical coating are an urgent concern.’? Newtonian gravity-gradient noise is
expected to affect the second-generation detectors in the low-frequency range,
where the first-generation detectors could not achieve good sensitivity.

^a The Gravitational Wave International Committee (GWIC) recommends operation
during the absence of operating large-scale interferometers, as astro-watch.

^b Institute of Space and Astronautical Science, later moved from the University
of Tokyo to JAXA, the Japan Aerospace Exploration Agency.

The above three stages roughly correspond to three stages of the fight against
(i) technical noises, (ii) thermal noises, and (iii) quantum noises. The
techniques are advancing rapidly and changing drastically. The noise treatment
used when the first-generation detectors were designed is no longer correct; for
example, the thermal-noise model based on velocity damping is being replaced by
one based on structure damping. Many good review articles on interferometric
gravitational-wave detectors have been available through the Internet since the
1980s.’* The most recent one is by R. Adhikari.” This section describes the kinds
of noise that limit the sensitivity and the techniques used to reduce them.


3.1. First stage against technical noises in prototype 
interferometers 


3.1.1. 3m-Garching interferometer 


In the early stage of laser-interferometer development, the optical configuration
was that of a Michelson interferometer with multi-bounce delay-line arms. This
kind of delay-line Michelson interferometer was developed at the
Max-Planck-Institut für Quantenoptik at Garching near München, at the
Massachusetts Institute of Technology (MIT), and at the Institute of Space and
Astronautical Science (ISAS) in Tokyo. The achievement of this prototype
interferometer was to reach the shot-noise level at frequencies higher than 1 kHz,
and to study the feasibility of a laser interferometer that can attain the
sensitivity required to detect gravitational-wave events occurring at the distance
of the Virgo cluster, which was estimated at that time to be
$\delta L/L = 10^{-21}$. The baseline length of the interferometer was only 3.2 m,
and the beam made 46 reflections between the mirrors. The laser source was an
argon-ion laser of 1.5 W, whose efficiency was not as good as that of the infrared
(1 μm) Nd:YAG lasers used at present.

Photon shot noise, which arises from the quantum nature of light, limits the
sensitivity of laser interferometers. To give a concrete picture of this noise,
consider the fluctuation that accompanies a current of $I = 1$ pA, a typical gate
leakage current in a low-noise field-effect transistor (FET). The magnitude of the
noise current is represented by its spectral density,

$$\sqrt{\langle i_n^2 \rangle} = \sqrt{2eI}, \tag{47}$$

which is ≈ 5 × 10^-16 A/√Hz, where $e$ is the electron charge. If this current,
$I = 1$ pA, is produced in a photo-detector by photons, 6 × 10^6 photons per second
must enter the photo-detector and be read out, assuming 100% efficiency. This is
equivalent to a power of 1.8 × 10^-12 W (about 2 pW) if the light frequency is
500 THz. According to Poisson statistics, $N$ quanta fluctuate by $\sqrt{N}$, and
if the quanta are electrons, the fluctuation produces a noise current.
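
The numbers in this example are easy to reproduce. The short sketch below
evaluates the shot-noise current density sqrt(2 e I), the number of quanta per
second carried by the current, and the corresponding optical power at a light
frequency of 500 THz; the function names are illustrative, and the constants are
the SI-defined values.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge, C
H_PLANCK = 6.62607015e-34   # Planck constant, J s


def shot_noise_current_density(current_a):
    """Shot-noise current spectral density sqrt(2 e I) in A/sqrt(Hz)."""
    return math.sqrt(2.0 * E_CHARGE * current_a)


def quanta_per_second(current_a):
    """Number of elementary charges per second carried by the current."""
    return current_a / E_CHARGE


def optical_power(current_a, light_freq_hz, efficiency=1.0):
    """Optical power (W) needed to produce the photocurrent."""
    photons_per_second = quanta_per_second(current_a) / efficiency
    return photons_per_second * H_PLANCK * light_freq_hz


noise_density = shot_noise_current_density(1e-12)
rate = quanta_per_second(1e-12)
power = optical_power(1e-12, 500e12)
```

For 1 pA this gives a noise density of a few times 1e-16 A per root hertz, a rate
of about six million quanta per second, and a power of roughly 2 pW, the same
order as the figures quoted in the text.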

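The minimum phase change observable with the Michelson interferometer, whose
photocurrent carries this fluctuating noise current, is obtained as follows.
Assuming $\phi_1 - \phi_2 = \phi_0$, a phase change $\delta\phi$ changes the
photocurrent by

$$\delta I = \frac{I_p}{2} \sin\phi_0 \cdot \delta\phi, \tag{48}$$

where $I_p = I_{\max} - I_{\min}$. Equating this with the above noise current,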


$$\delta\phi = \frac{2\sqrt{2eI}}{I_p \sin\phi_0} \ge \sqrt{\frac{2e}{I_{\max}}}\,\frac{1}{\sin(\phi_0/2)},$$


where $I = \frac{1}{2}(I_{\max} + I_{\min} + I_p \cos\phi_0)$, and the equality
holds in the ideal condition $I_{\min} = 0$. The minimum depends on the value of
$\phi_0$ and is obtained at $\phi_0 = \pi$, where the beams interfere
destructively and the output of the photo-detector is null. In this case, the
output signal cannot be extracted without a modulation method (see Appendix B).
Even if a modulation method is applied to extract the signal, the minimum noise
condition is the same as that given above.
The minimum detectable phase of the Michelson interferometer is given by

$$\delta\phi_{\min} = \sqrt{\frac{2e}{I_{\max}}}, \tag{49}$$

that is,

$$\delta\phi_{\min} = \sqrt{\frac{2\hbar\Omega}{\eta P}}, \tag{50}$$

where $I_{\max} = e\eta P/\hbar\Omega$, with $P$ the light power, $\Omega$ the
angular frequency of the light, and $\eta$ the efficiency of the photo-detector.
Since the signal output is proportional to $P$ while the shot noise grows only as
$\sqrt{P}$, the noise-to-signal ratio is in proportion to $1/\sqrt{P}$, which means
that a high-power laser is required to attain high sensitivity.

In the prototype experiment, the quality of the laser source (frequency and
amplitude noise and beam-jitter noise) was studied and remedies were presented.®*
The relevant factors turned out to be:

(i) Intensity variation, which induces noise through the fluctuating deviation
from the operating point,
(ii) Frequency variation, which produces noise through the absolute path
difference,
(iii) Lateral jitter, which couples to the mirror tilts and produces a false
displacement,
(iv) Pulsation in beam width, which creates a difference in wave-front curvature.

The remedies for the above noises led to a sensitivity at the photon shot-noise
level. It is impressive to realize how much the study of the stray-light problem
in this rather early prototype interferometer helped in designing the
second-generation detectors.
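
Returning to the shot-noise limit of this subsection, the minimum detectable
phase sqrt(2 hbar Omega / (eta P)) can be evaluated numerically as a rough guide.
The sketch below is illustrative only: the wavelength of 514.5 nm for the green
argon-ion line and the ideal efficiency eta = 1 are assumptions, not values given
in the text, and the function name is hypothetical.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
C_LIGHT = 299792458.0    # speed of light, m/s


def min_detectable_phase(power_w, wavelength_m, efficiency=1.0):
    """Shot-noise-limited phase sensitivity in rad/sqrt(Hz)."""
    omega = 2.0 * math.pi * C_LIGHT / wavelength_m  # optical angular frequency
    return math.sqrt(2.0 * HBAR * omega / (efficiency * power_w))


# 1.5 W is the laser power quoted for the 3 m prototype; 514.5 nm is assumed.
phase_1p5_w = min_detectable_phase(1.5, 514.5e-9)
# The same light at 100 times the power: the limit improves by a factor of 10.
phase_150_w = min_detectable_phase(150.0, 514.5e-9)
```

The hundredfold increase in power lowers the phase limit by only a factor of ten,
which illustrates why reaching high sensitivity by raw laser power alone is so
demanding.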


3.1.2. 30 m-Garching interferometer 


Based on the previous development of the 3 m interferometer, a longer delay-line
interferometer was developed in Garching,® which formed the basis of GEO 600.”°
It was a 30 m baseline interferometer with delay lines along the arms, using green
light from an argon-ion laser (Fig. 13). The operating point was set at the dark
fringe using an internal modulation scheme (refer to Appendix B).

The laser frequency was stabilized using both a reference cavity and the
arm-length change. The end mirrors were suspended, and the central part,
consisting of the beam splitter and the input mirrors, was fixed together onto a
suspended mass. Figure 14 shows the development of the suspension system from the
Garching 30 m prototype to the main suspension of the GEO interferometer,’ where a
triple pendulum structure is used in place of a single one. The advanced structure
looks complicated, but it is simpler from the point of view of thermal noise. The
important point is that the Garching 30 m prototype established the suspension
concept, the core of which is now used all over the world. In this prototype, both
the thermal noise and the transfer function of the mirror suspension system were
studied. However, only the thermally excited peak corresponding to the lowest
modal oscillation of the mirror was observed, and the thermal noise on the slope
away from the peak was far below the shot noise and seismic noise.

Fig. 13. Schematic view of the Garching 30 m interferometer, showing the argon-ion
laser, the reference cavity, and the delay lines (N = 4 shown).

Fig. 14. Development of the suspension system from the Garching 30 m prototype (a)
to the triple suspension system of the GEO interferometer (b). The advanced
structure looks complicated, but it is simpler from the point of view of thermal
noise.

The behavior of seismic noise is determined by the continuous vibration of the
ground. The amplitude of the seismic noise usually decreases with frequency, and
its power spectral density looks like that shown in Fig. 15. It is not easy to
find a site as quiet as that represented by the low-noise model in this figure.
The amplitude below 1 Hz depends largely on how strongly the crust of the Earth is
excited by external forces due to wind, the sea, and atmospheric and ground
surface waves.

A survey recently performed in the context of site selection for third-generation
interferometric detectors provides an overview of this subject, as shown in
Fig. 16.

Fig. 15. Square root of the power spectrum of typical seismic noise that affects
the sensitivity of interferometers.
Source: This figure is reproduced from Ref. 77.

Fig. 16. A survey, recently performed, in the context of site selection for
third-generation interferometric detectors. This figure shows the power spectrum
of high/low change at the KAGRA and Virgo sites.
Source: The figure was taken from the doctoral thesis of M. Beker (see Ref. 78).

The square root of the power spectrum of the seismic vibration of a ground floor,
$G_s$, is expressed by a widely known formula,

$$\sqrt{G_s} = 10^{-7}/f^2 \ [\mathrm{m}/\sqrt{\mathrm{Hz}}], \tag{51}$$

where $f$ is the frequency in Hz. This formula can be applied to both the
horizontal and vertical directions. Naturally, the amplitude varies from place to
place and from time to time, and a micro-seismic noise peak is seen between 0.1 Hz
and 1 Hz anywhere on Earth.

Vibration isolation is based on a mechanical pendulum or oscillator. For example,
a single pendulum 25 cm long (a pendulum period of 1 s) can achieve a suppression
of 10^-4 at 100 Hz under the assumption of an ideal point-mass pendulum. The
suspension system of the Garching 30 m interferometer was a simple pendulum, which
is the basis of the later suspension systems whose development is shown in
Fig. 14.
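
The pendulum example can be checked with a few lines. The sketch below combines
the empirical ground-noise formula 1e-7 / f^2 m per root hertz with the transfer
function of an ideal, undamped point-mass pendulum, 1 / |1 - (f/f0)^2|; the
neglect of damping, of the wire mass, and of internal resonances is an assumption
of this illustration, and the function names are hypothetical.

```python
import math

G_ACCEL = 9.80665  # standard gravitational acceleration, m/s^2


def pendulum_frequency(length_m):
    """Resonant frequency (Hz) of an ideal point-mass pendulum."""
    return math.sqrt(G_ACCEL / length_m) / (2.0 * math.pi)


def ground_noise(freq_hz):
    """Empirical seismic displacement noise, 1e-7 / f^2 in m/sqrt(Hz)."""
    return 1e-7 / freq_hz ** 2


def pendulum_transfer(freq_hz, f0_hz):
    """Magnitude of the ground-to-mass transfer function, no damping."""
    return 1.0 / abs(1.0 - (freq_hz / f0_hz) ** 2)


def suspended_mass_noise(freq_hz, length_m):
    """Residual displacement noise of the suspended mass in m/sqrt(Hz)."""
    f0 = pendulum_frequency(length_m)
    return ground_noise(freq_hz) * pendulum_transfer(freq_hz, f0)


f0_25cm = pendulum_frequency(0.25)
suppression_100hz = pendulum_transfer(100.0, f0_25cm)
residual_100hz = suspended_mass_noise(100.0, 0.25)
```

A 25 cm pendulum resonates very close to 1 Hz, so at 100 Hz the ideal suppression
is close to (1/100)^2, and the ground motion of about 1e-11 m per root hertz is
reduced to the 1e-15 level. Real suspensions fall short of this ideal figure at
high frequencies, which is one reason why multiple-stage pendulums were
introduced.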

The noise limiting the sensitivity in the observed power spectrum was explained by
the estimated photon shot noise, taking into account the modulation and the
transfer function of the suspension system developed for this interferometer. The
best sensitivity was a strain h of 3 × 10^-18 in a 1 kHz bandwidth. Although
scattered multi-beam light raised the noise level, the fundamental techniques
needed to realize a highly sensitive laser interferometer were acquired at this
stage.’4


3.1.3. Glasgow 10 m-Fabry—Perot Michelson interferometer 


The basic idea of utilizing Fabry-Perot cavities arose from a proposal and
experiment by Drever.”? A Fabry-Perot cavity stores light between two facing
high-reflectivity mirrors (refer to Appendix C). A Michelson interferometer with a
Fabry-Perot cavity in each arm was developed at Glasgow. The baseline length was
10 m, and a cw argon-ion green laser was used. Figure 17 shows a schematic view of
the Glasgow 10 m Fabry-Perot Michelson interferometer.® Unlike in the simple
Michelson interferometer, the returning beams from the two cavities were not
directly recombined, but were converted to electrical signals by photo-detectors
and then subtracted electrically. This optical configuration was adopted in order
to stabilize the laser frequency by using one of the Fabry-Perot cavities, while
the other cavity responds to incoming gravitational waves. It could also reduce
the number of feedback systems. Successful direct recombination in a Fabry-Perot
Michelson interferometer was first demonstrated in the mid-1990s.8° The output of
the primary arm was fed back to adjust the laser frequency so as to keep the
cavity on resonance, and the subtracted output was used to actuate the mirror of
the secondary cavity to keep it on resonance. The feedback signal carried the
information about any possible gravitational wave. This optical configuration is
called a locked Fabry-Perot cavity interferometer. A low-loss optical coating was
applied to the mirrors of the Fabry-Perot cavities. A multi-loop feedback system
was used to stabilize the frequency of the laser; the amplitude was controlled by
another feedback system. The beam splitter was mounted on a center plate suspended
by a single pendulum, and the input mirrors were also suspended by single-pendulum
systems. The end mirrors were suspended by double-pendulum systems so as to
improve the low-frequency noise performance.

Fig. 17. A Fabry-Perot Michelson interferometer at Glasgow University. The primary
arm was kept in resonance by frequency feedback to the laser, and the secondary
arm was kept in resonance by feedback actuation of the rear mirror, whose
actuation signal reflects the gravitational-wave signal. It was called a locked
Fabry-Perot cavity interferometer.

As deduced in Sec. 3.1.1, the noise of a Fabry-Perot Michelson interferometer is
obtained by equating the photocurrent corresponding to a gravitational wave with
the shot noise,®! if the effect of modulation is neglected. Taking $\eta$ as the
detection efficiency of the photo-detector,


|Hpp|oh = ,/—. (52) 


From this formula, fh in units of the band frequency is represented by 


Al1l+Fsin? a 
w(1 — rire) Cc (53) 
. (=) 20QnPo : 
sin{ — 
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which reduces to 


All + (7T.w)?] 
20.n Por? 


Amin xy (54) 
if wl/e< Landry < rz or 1—r; < 1 holds, where the light travel time is T; = 2e FF. 
A formula considering modulation is given in Ref. 66. 

The sensitivity achieved by the Glasgow 10 m interferometer was ~ 7 × 10^-20/√Hz
from 500 Hz to 3 kHz, which is better than that of the Garching 30 m
interferometer.


3.1.4. Caltech 40 m-Fabry—Perot Michelson interferometer 


The Caltech 40 m prototype was a locked Fabry-Perot cavity interferometer that
inherited the configuration of the Glasgow 10 m.® In the initial operation
(Mark I), which ended in 1992, the shot-noise level was achieved at higher
frequencies. On the other hand, seismic noise leaking through the isolation system
caused a steep increase of the noise spectrum at lower frequencies. In the
mid-frequency region between these, unidentified noise prevented the detector from
reaching the thermal-noise level, except at the peaks of the violin modes. By a
revision of the isolation stacks and the test mass, the noise spectrum in the
mid-frequency region was improved (Mark II), as shown in Fig. 18, which compares
it with the sensitivity of the initial LIGO design in terms of displacement rather
than strain. The test mass was made of fused silica and housed a mirror attached
by optical contact. The basic difference between the initial LIGO and the Mark II
lies in this point. The initial LIGO was constructed and operated without a
complete understanding of the unidentified noise of the Mark II. Many new
unidentified noises were found during operation of the first-generation detectors.

Fig. 18. Improvement of sensitivity in the Caltech 40 m prototype interferometer,
plotted against frequency (50 Hz to 5000 Hz). In order to compare LIGO and
Mark II, the displacement sensitivity is shown in place of the strain sensitivity.
The photon shot-noise level was achieved at higher frequencies, and the mirror
thermal-noise level was considered to be achieved, which made the start of the
initial LIGO possible.


3.1.5. ISAS 10m and 100 m delay-line interferometer 


Before TAMA,’! which was started in 1995, the Japanese research community studied
which type of interferometer should be adopted for a future large-scale detector
in Japan. The ISAS group introduced a delay-line interferometer and constructed a
10 m baseline detector® and a 100 m one.°? The NAOJ° group constructed a 20 m
Fabry-Perot Michelson interferometer.8° Although the delay-line type had a simpler
optical configuration, its main mirrors had to be larger, which meant that the
elastic resonant mode frequencies were lower and the thermal noise in the
observation frequency band was larger. Just as stray-light noise had been
confirmed in the 30 m Garching interferometer, noise hunting on the 100 m ISAS
interferometer was hampered by up-converted stray-light noise.8? On the other
hand, the mirror quality of a Fabry-Perot cavity needed to be higher. Considering
both practice and design, TAMA decided to adopt the same Fabry-Perot Michelson
configuration as LIGO® and Virgo.® In parallel with this decision, TAMA chose a
diode-pumped NPRO YAG laser (see the next section) as the light source, which had
already been tested on the 20 m Fabry-Perot prototype at NAOJ. When this was
decided, LIGO kept its original plan of using an argon-ion laser. The 100 m ISAS
detector also used an argon-ion laser.


3.2. Further R&D efforts in the first-generation detectors 


In order to increase the sensitivity, a higher power of laser is required as discussed in 
Sec. 3.1.1. For example, we need a laser power of 10 MW level to obtain a sensitivity 
of h ~ 10-8 using 534nm light in the 500Hz bandwidth. The leading technology 
for this objective is the nonplanar ring oscillator (NPRO), invented by Byer and
Kane83 in 1985. For the first-generation interferometers, a stable, single-mode
laser source was realized with a laser-diode-pumped Nd:YAG laser, which produced
power at the level of a few W. Building on this laser-diode-pumped Nd:YAG laser, a sustained
R&D effort has so far achieved high-power lasers at the 100 W level.84 In order to increase
the power up to 100 W, an injection-locking system85 or a master oscillator with a
power amplifier (MOPA) system is needed, with which a single-mode power at the 100 W level
has been achieved.86 Under such high optical power, all optical parts need
to endure the power level without degrading their performance.87 It is no
exaggeration to say that the laser-interferometric detector became possible
only after the manufacture of extremely low-loss mirrors, together with the means of
measuring them, was established.88 In the second-generation detectors, the power inside the cavity reaches
the 1 MW level by combining a high-power laser with the power recycling technique.
In place of increasing the laser power itself, an alternative method, the power recycling technique,
was developed. Along with power recycling, another idea, that of signal recycling,
arose, which led to the resonant side-band extraction method.


3.2.1. Power recycling 


Since the sensitivity of a laser interferometer is limited by the shot noise, being
represented in phase noise as

$$\phi_{\rm noise} = \sqrt{\frac{2\hbar\Omega}{\eta P}}, \tag{55}$$

the output signal is proportional to the optical power in the interferometer for 
a given phase shift. The phase shift produced by a given mirror displacement is 
increased by taking the multi-path of the beam or using a resonant Fabry—Perot 
cavity. The power recycling technique, invented by Drever89 and confirmed by several
experiments,90-94 became a standard technique to further enhance the power
stored in the interferometers and to lower the impact of shot noise on the sensitiv- 
ity. This idea arose from the condition of the dark-fringe operation of a Michelson 
interferometer, which produces the minimum shot noise at the output port of the 
interferometer. In this condition, if power loss inside the interferometer is negligible, 
all laser power returns back to the laser source, which can be reflected by a mirror in 
phase with the original input light. This situation forms a new Fabry—Perot cavity 
including both optical arms. 

Let us consider that the arms are made of Fabry-Perot cavities. The ratio of the
internal power to the input power in a simple Fabry-Perot cavity is given by

$$\frac{P_{\rm int}}{P_0} = \frac{T_1}{\left[1-\sqrt{R_1R_2}\right]^2} = \frac{T_1}{\left[1-\sqrt{(1-A_1-T_1)(1-A_2)}\right]^2}, \tag{56}$$

where $T_i$, $R_i$ and $A_i$ are the power transmittance, power reflectivity, and power loss
of each mirror, respectively. The transmittance of the second mirror is included in
its power loss. The power recycling mirror is regarded as the first mirror, and
the mirrors in the interferometer arms are regarded as a combined second
mirror. Under this condition, the power ratio in the above equation, $P_{\rm int}/P_0$, is
called the recycling gain $G_{\rm rec}$. If the reflectivity of the second mirror and the loss
of the first one are given by $R_2$ and $A_1$, respectively, the power ratio reaches its
maximum at

$$T_1 = (1-A_1)\left[1 - R_2(1-A_1)\right]. \tag{57}$$


Utilizing this simple model, the maximum recycling gain of the interferometer is
obtained as

$$G_{\rm rec}^{\rm max} = \frac{P_{\rm int}}{P_0} = \frac{1-A_1}{A_1+A_2-A_1A_2} \simeq \frac{1}{A_1+A_2}, \tag{58}$$

which shows that the maximum gain is 1/(all loss).
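The optimum in Eqs. (56)-(58) can be checked numerically. The following is a minimal sketch, assuming the two-mirror model of the text with $R_1 = 1 - A_1 - T_1$ and $R_2 = 1 - A_2$; the loss values and all function names are hypothetical and chosen only for illustration.

```python
import math


def recycling_gain(T1, A1, R2):
    """Power build-up P_int/P_0 of Eq. (56) for a two-mirror cavity."""
    R1 = 1.0 - A1 - T1  # reflectivity left after loss and transmission
    return T1 / (1.0 - math.sqrt(R1 * R2)) ** 2


def optimal_T1(A1, R2):
    """Transmittance of the first mirror that maximizes the gain, Eq. (57)."""
    return (1.0 - A1) * (1.0 - R2 * (1.0 - A1))


def max_gain(A1, A2):
    """Closed form of the maximum gain, Eq. (58), using R2 = 1 - A2."""
    return (1.0 - A1) / (A1 + A2 - A1 * A2)


# Hypothetical losses: 0.1% in the recycling mirror, 2% in the arms.
A1, A2 = 1e-3, 2e-2
R2 = 1.0 - A2

T1_opt = optimal_T1(A1, R2)
G_opt = recycling_gain(T1_opt, A1, R2)

print(f"optimal T1        = {T1_opt:.5f}")
print(f"gain at optimum   = {G_opt:.2f}")
print(f"closed form (58)  = {max_gain(A1, A2):.2f}")
print(f"1/(A1 + A2)       = {1.0 / (A1 + A2):.2f}")
print(f"shot-noise factor = 1/{math.sqrt(G_opt):.2f}")
```

With these assumed losses the gain evaluated from Eq. (56) at the transmittance of Eq. (57) coincides with the closed form of Eq. (58), and detuning $T_1$ in either direction lowers it.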
Thanks to recycling, the shot noise can be further reduced by a factor of $\sqrt{G_{\rm rec}}$. The
gravitational-wave signal arising from the phase difference of the two arms goes out
to the photo-detector port without being affected by the power recycling, except for
the effect of the storage time of the Fabry-Perot cavity, which can be remedied
by the resonant side-band extraction described in the next subsection.

Note here that: (i) two stages of the Fabry—Perot cavities create couplings of the 
optical modes and (ii) the transmittance of the input mirror (first mirror) should 
be much larger than its optical loss. 

In applications to practical interferometers, multiple modulation frequencies need
to be adopted, and a control-signal extraction scheme must be developed. For the
initial LIGO, frontal modulation was applied to a table-top Michelson interferometer
with Fabry-Perot cavities.95 The Virgo project also adopted the frontal modulation
method.96 For the TAMA project, practical results of this technique are reported in
Refs. 97 and 98.


3.2.2. Signal recycling and resonant side-band extraction 


Adding another mirror, M3, to a power-recycled Fabry-Perot Michelson interferometer,
between the beam splitter and the signal output port as shown in Fig. 19,
makes it possible to increase the sensitivity within a specific bandwidth of interferometric
detectors. This signal recycling mirror allows us to realize standard, detuned,
dual or resonant recycling.8-°9 Meers neatly analyzed the frequency response of
interferometric gravitational-wave detectors100 in a way that can be applied to all optical
systems with slight extensions, where the interferometer can contain delay-lines
or Fabry-Perot cavities, whether or not power recycling is used and whether the
recycling scheme is standard, detuned or dual. The interferometer is arranged so
that, when light beams from the two arms meet at the beam-splitter, the original
laser frequency heads back to the mirror M0. On the other hand, any side-bands
produced by differential phase modulation travel towards mirror M3 and the output
of the interferometer. The mirrors M0 and M3 will have different relative positions
and reflectivities, which should be taken into account when calculating how both the
laser light and the side-bands resonate; that is, the laser frequency and the
side-band frequency experience different reflectivities and optical lengths.

Fig. 19. Adding another mirror, M3, to a power-recycled Fabry-Perot Michelson interferometer
makes it possible to increase the sensitivity and bandwidth of interferometric detectors. M0 is the
power recycling mirror, and mirrors M1 and M2 form a Fabry-Perot cavity. (Schematic labels: From
laser, M0, BS, M1, M2, M3, To detector.)

Dual recycling was experimentally demonstrated with fixed mirrors101 and later with
suspended mirrors.102

In this signal recycling, the bandwidth for gravitational waves is limited by the
photon storage time in the combined cavity. To escape from this limit, a resonant
side-band extraction (RSE) configuration was proposed for interferometers adopting
Fabry-Perot cavities.103 This configuration uses a signal-extraction cavity, that is,
the combined cavity formed by the mirror M3 and the arm cavity, which is kept in resonance
for the side-band frequencies (unlike signal recycling, where the signal recycling
cavity does not resonate). The bandwidth can be adjusted so as to remain wide even
if the finesse of the Fabry-Perot cavity is increased to improve the sensitivity. Since
the signal recycling mirror introduces another degree of freedom, the length control
of the interferometer becomes more complicated, which requires a more sophisticated
sensing and control scheme.104 The resonant side-band extraction configuration is
applied to second-generation interferometers.


3.3. Fighting with thermal noise of the second stage 


In designing the first-generation interferometers, thermal noise was considered to
dictate the mechanical vibration of the optical mirror and suspension system. Thermal
noise is one of the fundamental noises. These fundamental noise sources are
schematically shown in Fig. 20, with an inset showing the frequency
band in which each noise dominates. They are:


(i) seismic noise 
(ii) thermal noise 
(iii) shot noise. 


Newtonian gravity gradient noise, coating thermal noise and radiation pressure
noise will be considered when we discuss the sensitivity limit of the second-generation
detectors. In the early stage of the first-generation detector era, coating
thermal noise was considered to be smaller than that of the mirror substrate. However,
simulations and measurements showed that this was not true.105
Needless to say, the sensitivity of an interferometer is determined not only by these
noise sources, but also by noise arising from cross-coupling between unsuppressed
technical noise and any imperfection of the interferometer optical system.
For example, noise is induced by laser power fluctuations via any absorption asymmetry,106
and higher frequency noise is induced by any relatively large vibration of
the suspension pendulum due to seismic noise.107 Also, mirror tilts may couple with
Earth's gravitational field, which may induce fluctuations of the baseline of the arm
length.108 Many more noise sources of this kind exist, and as yet unidentified ones
will emerge as the known noises are removed or reduced in the future.

Fig. 20. Summary of fundamental noises that limit the sensitivity of a first-generation interferometer.
Newtonian gravity gradient noise, coating thermal noise and radiation pressure noise will
appear in the discussion of second-generation detectors.


3.3.1. Mirror and suspension thermal noise 


Thermal fluctuation arises from the dissipative mechanism of a material,109 which
limits the sensitivity of interferometers. Thermal vibrations that affect the optical
path length of the laser beam are:


(i) internal elastic vibration mode 
(ii) pendulum swing mode 
(iii) violin mode of the suspension wire. 


The thermal noise of a coating material is considered in the next subsection. The 
effect of the third item is reduced in terms of the optical path direction by the ratio 
of the “mass of the wire” (at most 0.1g) and the “mass of the mirror” (more than 
10kg). 

The thermal noise power spectrum due to mechanical modes of mirrors can be
approximated by summing the contributions of all the internal modes of each test
mass as

$$\langle |x(\omega)|^2\rangle = \frac{4k_BT}{\omega}\sum_n \frac{\phi_n\,\omega_n^2}{m_n\left[(\omega_n^2-\omega^2)^2+\phi_n^2\omega_n^4\right]}. \tag{59}$$


This is correct in the case where the damping $\phi_n$ of each mode does not depend
on frequency, and assuming that the modes are normal.110 Several experiments have shown
no frequency dependence, even over a wide range of frequency, but
frequency independence has not been demonstrated in general. Indeed, thermal noise caused
by inhomogeneously distributed loss should be considered in evaluating the
overall effect of a compound system, such as a mirror with its suspension and an attached
actuator part.111
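As an illustration of the mode-expansion picture, the sketch below evaluates the contribution of a single mode with frequency-independent (structure) damping, using the standard single-oscillator form $4k_BT\phi\,\omega_0^2/\{m\omega[(\omega_0^2-\omega^2)^2+\phi^2\omega_0^4]\}$. The resonance frequency of 51 kHz and the temperature of 20 K follow the text; the loss angle and the effective mass are assumed values for illustration only, and the function name is hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant [J/K]


def mode_thermal_psd(omega, omega0, phi, mass, temperature):
    """Displacement power spectrum [m^2/Hz] of one structurally damped mode."""
    denom = mass * omega * ((omega0**2 - omega**2) ** 2 + phi**2 * omega0**4)
    return 4.0 * K_B * temperature * phi * omega0**2 / denom


omega0 = 2.0 * math.pi * 51e3   # lowest internal mode, from the text
temperature = 20.0              # [K], from the text
phi = 1e-8                      # assumed loss angle (illustrative)
mass = 1.0                      # assumed effective mass [kg] (illustrative)

omega = 2.0 * math.pi * 100.0   # observation frequency of 100 Hz
psd = mode_thermal_psd(omega, omega0, phi, mass, temperature)

# Far below resonance the spectrum reduces to 4 k_B T phi / (m omega omega0^2).
low_freq_limit = 4.0 * K_B * temperature * phi / (mass * omega * omega0**2)

print(f"amplitude at 100 Hz : {math.sqrt(psd):.3e} m/sqrt(Hz)")
print(f"low-frequency limit : {math.sqrt(low_freq_limit):.3e} m/sqrt(Hz)")
```

Far below the resonance the amplitude scales as $T^{1/2}\phi^{1/2}m^{-1/2}\omega^{-1/2}\omega_0^{-1}$, which is why both cooling and a low-loss substrate reduce the mirror thermal noise in the observation band.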

In place of the mode-expansion method, a direct method using
Green's function was applied by Nakagawa,112 and also presented by Levin.113 This method is
now the standard one for thermal noise estimation.

Let us consider, as an example of the lowest mechanical internal mode term 
in Eq. (59), coupled to the beam, of a test mass with the size of TAMA mirrors 
(wo/27 = 51 kHz, ¢ = 10~° if cooled down to 20K, one has 


- T 1/2 @ 1/2 36k 1/2 
(jz(w)[2) = 6.9 x 1077! (sax) (<2) (=) 


1/2 
‘ (aan (axe) sn //Eiz, (60) 


Ww Wo 


Higher modes of the cylindrical body were analysed and confirmed by experiment,114
from which a non-negligible contribution to the thermal noise was shown.115
If all higher-mode contributions are added, the amplitude becomes three times
larger than the lowest-order value, as estimated for a TAMA-sized mirror.116
There are two mirrors in each arm of the Fabry-Perot cavity, and there is no correlation
among those mirrors. Therefore, the total thermal noise is obtained by taking
the squared sum, which is about 6 times larger in amplitude than the above value
when higher-mode contributions are considered.

With respect to the thermal noise of the pendulum mode, the damping factor,
which is related to the dissipation, needs to be found. Since the pendulum
frequency is typically rather low (e.g. 0.6-0.8 Hz), it is difficult to find
the factor $\phi$ of the pendulum experimentally. It can be estimated from a measurement
of the loss factor of the violin mode of the wire.117

The structure damping is given by $F = -k[1 + i\phi]x$, where $k$ is the spring
constant and $x$ is the displacement. If $\phi \ll 1$, the above equation is represented by
$F = -ke^{i\phi}x$. This suggests that the displacement is retarded with respect to the
force by the phase $\phi$ in the complex plane. If $x = x_0e^{i\omega t}$ holds, the time average
of $F\cdot\dot{x}$ becomes

$$\langle F\cdot\dot{x}\rangle = \pi\phi f k x_0^2 = 2\pi\phi f E, \tag{61}$$

where $f = \omega/2\pi$ and $E = kx_0^2/2$ is the vibration energy.
This means that a fraction $2\pi\phi$ of the vibration energy is dissipated in
one cycle of the vibration. In the case of the pendulum, since the restoring force
due to gravity has no dissipation, dissipation occurs only in the deformation of
the wire. Accordingly, the damping of the pendulum system is given, using the wire
damping $\phi_w$, by118


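$$2\pi\phi_{\rm pend}(E_{\rm grav} + E_{\rm wire}) = 2\pi\phi_w E_{\rm wire}. \tag{62}$$

That is,

$$\phi_{\rm pend} = \phi_w\frac{E_{\rm wire}}{E_{\rm grav}+E_{\rm wire}} \simeq \phi_w\frac{k_{\rm el}}{k_{\rm grav}}, \tag{63}$$

where $k_{\rm el} = n\sqrt{TEI}/2\ell^2$ with the number of wires, $n$, the wire tension, $T$, that
suspends the mirror, and the wire length, $\ell$. Here, $E$ is Young's modulus and $I$ is the second moment of
the cross-sectional area of the wire. With $k_{\rm grav} = mg/\ell$, the damping factor is rewritten as

$$\phi_{\rm pend} = \phi_w\frac{n\sqrt{TEI}}{2mg\ell}. \tag{64}$$

Since $T \sim mg/n$, the coefficient of $\phi_w$ is small; its inverse, called the dilution factor, is of
the order of a few thousand.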
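The dilution of the wire loss in Eqs. (63) and (64) can be evaluated with a short calculation. The following is a minimal sketch, assuming a round wire (so that $I = \pi r^4/4$) and purely hypothetical suspension parameters; none of the numbers below are taken from the text, and the function name is illustrative.

```python
import math

G_ACCEL = 9.80665  # standard gravitational acceleration [m/s^2]


def pendulum_loss(phi_w, n_wires, mass, length, youngs_modulus, wire_radius):
    """Return (phi_pend, k_el, k_grav) following Eqs. (63) and (64)."""
    tension = mass * G_ACCEL / n_wires                  # T ~ m g / n per wire
    second_moment = math.pi * wire_radius**4 / 4.0      # I of a round wire
    k_el = n_wires * math.sqrt(tension * youngs_modulus * second_moment) / (2.0 * length**2)
    k_grav = mass * G_ACCEL / length                    # gravitational spring constant
    return phi_w * k_el / k_grav, k_el, k_grav


# Hypothetical suspension: 10 kg mirror on four 25 cm steel-like wires.
phi_w = 1e-4
n_wires, mass, length = 4, 10.0, 0.25
youngs_modulus, wire_radius = 2.0e11, 100e-6

phi_pend, k_el, k_grav = pendulum_loss(
    phi_w, n_wires, mass, length, youngs_modulus, wire_radius
)

print(f"k_el / k_grav   = {k_el / k_grav:.3e}")
print(f"dilution factor = {k_grav / k_el:.0f}")
print(f"phi_pend        = {phi_pend:.3e}")
```

Thinner or longer wires reduce $k_{\rm el}/k_{\rm grav}$ further, which is how the pendulum loss can be made much smaller than the intrinsic loss of the wire material.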

Finally, the thermal noise of the pendulum is calculated;119 for a TAMA-size sapphire
mirror suspension at a temperature of 20 K, its magnitude is


T \w2 a 4 1/2 
2) — 3 1 —22 0 pend 
eos (sax) (= x Li) (Se 


6ke\\/? (22 x 100Hz\*”? 


m WwW 


where the damping factor of the wire is the measured value for the violin mode,
assuming no frequency dependence.


3.3.2. Thermal noise of optical coating 


Mirrors have optical coatings for controlling their reflectivity and transmittance.
Commonly, the dielectric coating consists of alternating layers of SiO2 (silica) and
Ta2O5 (tantala), the latter of which has a substantially larger mechanical loss.105 Although the mechanical
loss of the substrate is quite low (~ $10^{-8}$), that of the coating is relatively high,
of the order of $10^{-3}$ to $10^{-4}$. However, since the coating is much thinner
than the substrate, it had been considered that the thermal noise arising from
the coating was negligible for the first-generation detectors, which is not correct, as
stated in the beginning of this section. The mechanical loss of the coating partly
arises from thermoelastic damping, where the coating and the substrate have different
thermal properties.120

The mechanical loss, $\phi$, is related to the quality factor, $Q$, by $\phi = 1/Q$.
If all losses other than those of the substrate and coating can be neglected, the total loss is
equal to the sum of the intrinsic loss of the substrate plus any loss associated with
the coating,

$$\phi_{\rm total} = \phi_s + \frac{U_c}{U_s}\phi_c, \tag{66}$$

where subscripts "c" and "s" denote the coating and substrate, respectively. $U_c$ is
the energy stored in the coating and $U_s$ is that in the substrate. The term $\phi_c$ includes
losses in the coating materials, in the coating interfaces and in the coating-substrate
interface. $\phi_c$ is the sum of the residual loss of the coating and a thermoelastic term,
$\phi_c = \phi_{\rm residual} + \phi_{\rm th}$. $\phi_{\rm th}$ is given by120


2C rT 1 Ea 1 Esas 
Cr \l-v ee Cyl — vs 


—— 
(; St 


where $g(f)$ is a frequency-dependent term; $E$ and $C$ are the Young's modulus and heat
capacity, respectively; $\alpha$ is the coefficient of thermal expansion, and $\nu$ is Poisson's ratio.

The coating thermal noise is the most serious noise in both the advanced LIGO
and Virgo detectors. A study to replace the coating material by a lower-loss material
is urgent. A recent discovery121,122 is titania-doped Ta2O5, which produced
a mechanical loss of $2 \times 10^{-4}$ with optical absorption less than 0.5 ppm. This is
promising. As a competing replacement, Nb2O5 may be promising due to its
low mechanical loss, but it has higher optical absorption. Another approach in Italy gave
a promising result using a nano-layer coating with appropriate annealing,123 which
agrees well with the mixture formulas for the complex bulk and shear-loss angles
in connection with a new coating noise model.124

Since the KAGRA interferometer adopts cryogenic mirrors, the coating thermal
noise is less significant than in a room-temperature interferometer. In order
to confirm this point, the mechanical loss of a silica/tantala coating on a sapphire
substrate was measured at cryogenic temperatures and was found to be temperature
independent, which supports the KAGRA design.125 However,
other measurements at Glasgow University gave a conflicting result.126,127 This
apparent conflict needs to be clarified; it is being pursued experimentally,
but has not yet been clearly resolved.128


2
If): (67)


3.4. Fighting against quantum noises and squeezing 


Radiation-pressure noise and Newtonian gravity gradient noise appear in the
second-generation interferometers.129-131 In this section, quantum noise-related
topics are described.


3.4.1. Radiation pressure noise 


Radiation-pressure noise arises from the random force exerted on the test-mass mirror by
amplitude fluctuations of the laser light, which serves as the carrier light to read out the
displacement of the mirror. This is a manifestation of quantum back action. Higher
power makes the noise larger. This noise was irrelevant to the first-generation interferometers;
however, it will constrain the sensitivity of the second-generation
ones at lower frequencies. So far, we have seen no experimental report presenting
any successful observation of the radiation pressure noise by interferometers, including
prototype ones.


3.4.2. Squeezing 


Remarkably, although radiation pressure noise has not been directly observed,
studies of squeezing and of issues related to the standard quantum limit have led
to plans for implementing squeezing in advanced detectors
(e.g. upgrades of second-generation detectors and third-generation ones). Squeezing
as applied to a Fabry-Perot Michelson interferometer is described in this subsection.
The first motivation for squeezing arises from the need for higher light power
to suppress the photon shot noise. The cavity power of the first-generation detectors
was at the 10 kW level, which needs to be increased up to the 1 MW level in a
second-generation detector. Even for the first-generation detectors, the Fabry-Perot
cavity had to be equipped with a compensation system to cancel out the thermal-lensing
effect on the input mirrors. In addition to this problem, all optical parts need
to maintain their performance under high-power heat production. Relaxing these
requirements by squeezing will also improve the sensitivity and the event rate.

Quantum squeezing and quantum noise belong to quantum optics, and were originally
formalized by Glauber.132 A quantized single-mode electromagnetic (light)
field is represented by a phase quadrature and an amplitude quadrature. These are
non-commuting Hermitian operators that obey the Heisenberg uncertainty
principle.133 By applying this uncertainty principle to the momentum measurement of
the test mass, the standard quantum-limit sensitivity, $h_{\rm SQL}$, is obtained as

$$h_{\rm SQL} = \sqrt{\frac{8\hbar}{M\Omega^2L^2}}, \tag{68}$$

where $L$, $M$ and $\Omega$ are the baseline length of the interferometer, the mass of the
test mass and the angular frequency of the observation band, respectively. Note
that test-mass quantization has no influence on the output noise.134 Shot noise is
associated with the phase quadrature of the input vacuum field, while radiation
pressure noise is associated with the input amplitude quadrature. The power spectrum
of the radiation pressure noise, $S_h^{\rm rp}$, is proportional to the cavity power, $I_0$, whereas
that of the photon shot noise, $S_h^{\rm shot}$, varies as $1/I_0$. The total noise
power spectrum, $S_h$, is the sum of these noises,

$$S_h = S_h^{\rm shot} + S_h^{\rm rp}. \tag{69}$$


At the frequency of the optimum sensitivity in the first-generation interferometer,
where the input laser power is at the few W level, the radiation pressure noise is
lower. If the power is increased, $S_h^{\rm rp}$ increases while $S_h^{\rm shot}$ decreases, and at some
level $S_h^{\rm rp}$ becomes equal to $S_h^{\rm shot}$, that is, $S_h = 2S_h^{\rm shot}$, which cannot be less than
the noise given by Eq. (68).135 At this optimization, the input laser power, $I_0$,
is estimated by $I_{\rm SQL} = mL^2\gamma^4/(4\omega_0) \sim 10$ kW, where $\omega_0$ is the cavity resonance
angular frequency, $m$ the mirror mass, $L$ the cavity length, and $\gamma$ the cavity resonance
width. The power spectrum of the quantum noises is given at this optimized
condition ($I_0 = I_{\rm SQL}$) by

$$S_h = \frac{h_{\rm SQL}^2}{2}\left[\frac{1}{K} + K\right], \tag{70}$$

where $K = 2\gamma^4/\Omega^2(\gamma^2 + \Omega^2)$ and $\Omega$ is the angular frequency of the gravitational wave.
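The relations in Eqs. (68) and (70) can be explored numerically. The sketch below is only illustrative: the mirror mass of 30 kg, the 4 km arm length, the resonance width $\gamma = 2\pi\times100$ rad/s and the optical angular frequency $1.8\times10^{15}$ rad/s are assumed values, not taken from the text, and the function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant [J s]


def h_sql(mass, length, big_omega):
    """Standard quantum limit of Eq. (68) [1/sqrt(Hz)]."""
    return math.sqrt(8.0 * HBAR / (mass * big_omega**2 * length**2))


def i_sql(mass, length, gamma, omega0):
    """Optical power at which the standard quantum limit is reached [W]."""
    return mass * length**2 * gamma**4 / (4.0 * omega0)


def coupling_k(big_omega, gamma):
    """Coupling constant K of Eq. (70) at I_0 = I_SQL."""
    return 2.0 * gamma**4 / (big_omega**2 * (gamma**2 + big_omega**2))


def quantum_noise_psd(mass, length, gamma, big_omega):
    """Quantum-noise power spectrum S_h of Eq. (70) at I_0 = I_SQL."""
    k = coupling_k(big_omega, gamma)
    return 0.5 * h_sql(mass, length, big_omega) ** 2 * (1.0 / k + k)


# Assumed, illustrative interferometer parameters.
mass, length = 30.0, 4000.0
gamma = 2.0 * math.pi * 100.0
omega0 = 1.8e15

power = i_sql(mass, length, gamma, omega0)
print(f"I_SQL            = {power / 1e3:.1f} kW")
print(f"h_SQL at 100 Hz  = {h_sql(mass, length, gamma):.2e} /sqrt(Hz)")
for freq in (30.0, 100.0, 300.0):
    big_omega = 2.0 * math.pi * freq
    ratio = math.sqrt(quantum_noise_psd(mass, length, gamma, big_omega)) / h_sql(
        mass, length, big_omega
    )
    print(f"{freq:5.0f} Hz: sqrt(S_h)/h_SQL = {ratio:.3f}")
```

Because $1/K + K \ge 2$, the quantum noise in this model touches the standard quantum limit only where $K = 1$, which for the expression above occurs at $\Omega = \gamma$, and lies above it at all other frequencies.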

Squeezing is a technique used to reduce the fluctuation of one of the quadratures
at the expense of increasing the fluctuation of the other quadrature. The first proposal
was presented by Braginsky, who called it quantum nondemolition (QND).136
Kimble et al. proposed to convert conventional interferometric detectors to quantum
nondemolition interferometers by modifying their input and/or output optics, and
analyzed three kinds of squeezing for designing the future LIGO interferometer
in 2001.135 They were


(i) squeezed-input interferometer: squeezed vacuum with frequency dependent 
squeeze angle is injected to the interferometer’s dark port, 
(ii) variational output interferometer: homodyne detection with frequency depen- 
dent homodyne phase is performed on the output light, 
(iii) squeezed variational interferometer: squeezed input and frequency dependent 
homodyne output. 


Here, the noise spectrum evaluation was conducted assuming a lossless cavity per- 
formance of the interferometer. 

The techniques for preparing vacuum squeezing needed to be suitably adapted
for practical implementation in gravitational-wave interferometers. In the first
demonstration of squeezing injection effective in the gravitational-wave frequency
band, conducted with a bench-top apparatus, the $\chi^{(2)}$ nonlinearity in optical
media was used to create a squeezed vacuum.137 This technique was also used to
produce a squeezed vacuum for the Caltech 40 m prototype interferometer, and
achieved a squeezing enhancement of 3 dB at shot-noise-limited frequencies above
42 kHz.138 Squeezed light was also injected into GEO 600, and achieved 3.7 dB
squeezing,139 described in the next sub-section. The strain sensitivity of LIGO was
improved by up to 2.15 dB by injecting squeezed light at the Hanford site.140 Apart
from the technique utilizing the nonlinearity of optical media, Corbitt succeeded in
extracting the radiation-induced squeezing, called "ponderomotive", generated inside
an interferometer, which is a result of the coupling between the optical field and
the mechanical motion of the mirror, in a table-top experiment.141
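To relate the decibel figures quoted above to sensitivity, the following small sketch converts a squeezing level into a noise-amplitude factor. It assumes that the quoted values are ratios of noise power, so that the amplitude improves by $10^{{\rm dB}/20}$; this convention and the function name are assumptions made for illustration.

```python
def db_to_amplitude_factor(db):
    """Amplitude improvement for a noise-power reduction given in dB."""
    return 10.0 ** (db / 20.0)


# Squeezing levels quoted in the text.
quoted = [
    ("Caltech 40 m prototype", 3.0),
    ("GEO 600", 3.7),
    ("LIGO Hanford", 2.15),
]

for label, db in quoted:
    factor = db_to_amplitude_factor(db)
    print(f"{label:24s} {db:5.2f} dB -> noise amplitude reduced by x{factor:.2f}")
```

Under this assumption, a 3 dB reduction corresponds to roughly the same shot-noise amplitude as doubling the optical power, without the accompanying thermal load on the optics.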

Optical loss essentially degrades the squeezed vacuum, so the control of
losses is key to achieving good squeezing performance. This situation is reviewed
in Ref. 142.
4. Large Scale Projects


The start of construction of large-scale interferometers in the 1990s opened the era of
interferometric gravitational-wave detectors. The sensitivity improvement
of cryogenic resonant antennas was blocked by the quantum limit, which was hard to
escape from. In contrast to cryogenic bar antennas, the sensitivity of interferometers
can be increased simply by lengthening the baseline, with the help of known and
available technical improvements. Based on experimental results from prototype
interferometers, practical designs of large-scale interferometers were produced and
their construction was started. The first-generation detectors have finished
their role, and their renovation to second-generation ones is ongoing and will end
this year. It is fitting that those detectors begin operation in the centennial
year of Einstein's general relativity. In this section, large-scale projects
around the world are introduced.


4.1. LIGO project 


The LIGO project started in 1994 with the aim of actually catching gravitational-wave events
by constructing a pair of facilities with 4 km baseline length, separated by 3030 km,
in Livingston, Louisiana and in Hanford, Washington,® as shown in Fig. 21. In
Hanford, two parallel interferometers were installed: a 4 km baseline one (H1) and
a half-sized (2 km) one (H2). The design target was to observe neutron star coalescence
occurring in the Virgo cluster, 20 Mpc away, where the theoretical event rate is
typically $4\times10^{-3}$ to $4\times10^{-4}$ per year. The first observation, called science run #1
(S1), was conducted in 2002, and observations were repeated 6 times until 2010
(S1-S6), where the run conducted in 2009-2010 was called enhanced LIGO because
advanced technologies were applied and tested for the advanced LIGO.!%

The optical configuration of LIGO was a power-recycled Fabry-Perot Michelson
interferometer adopting a 10 W Nd:YAG nonplanar ring oscillator as the laser source.


Fig. 21. LIGO project started in 1994 to construct a pair of 4km baseline length scale facilities for 
laser interferometers separated by 3030km, which were in Livingston, Louisiana and in Hanford, 
Washington. (For color version, see page I-CP11.) 


Source: These pictures are taken from Ref. 72. 
Fig. 22. During the initial LIGO project, it took about five years to attain the target design
sensitivity after the installation. However, a much shorter period is expected for achieving the
sensitivity of the advanced LIGO, the installation of which was finished in 2015. (For color version,
see page I-CP12.)


It took roughly 5 years, until 2006, to reach the target sensitivity of the project design,
which was 33 Mpc for neutron-star binary coalescence at the optimum of both the
direction^d and orbital axis configuration (Fig. 22). A summary of the detector
performance along with scientific results up to S5 can be found in Ref. 72. Over
the period of S5, the individual duty cycles, a statistical figure of merit
representing the fraction of the observation period during which a detector was operating,
were 78%, 79%, and 67% for H1, H2, and L1, respectively; for double coincidence
between L1 and H1 or H2, the duty cycle was 60%; for triple coincidence of all three
detectors it was 54%.72 According to the published results of the analyses applied to the
data collected in those science runs, no detection
of gravitational-wave events has been reported so far. Since 2009, the modification of the interferometers
has proceeded through the enhanced LIGO toward the advanced LIGO project so as
to assure detection;!? this was completed in 2015, and commissioning has accordingly
started.

The strain sensitivity of the advanced LIGO is better by about a factor of 10
than that of the initial LIGO over a broad band, while the lowest observation
frequency is lowered to 10 Hz. This is attained by new seismic isolation consisting of
three-stage anti-vibration, a four-stage pendulum design, 40 kg test-mass optics, and
180 W laser subsystems. The input laser power has been increased from 7-8 W to
125 W, and the light power in the cavity is 800 kW.

^d 13 Mpc in average distance, which is the all-sky and inclination average for the coalescence of a
$1.4\,M_\odot$ neutron star binary at SNR = 8.

In order to reduce thermal lensing in the input test mass, very low absorption
coatings and substrates are being pursued and designed. However, compensation
is still required, as in the initial LIGO,143 the experience of which is being applied to
develop a more efficient compensation system. In its optical configuration, power
recycling and signal recycling are applied, which allows the shape of the sensitivity
curve to be changed for various astrophysical sources. The DC-readout technique can eliminate
electric noise arising in both modulation and demodulation. It was tested at the Caltech
40 m prototype interferometer, which confirmed the expected performance.144
To cope with thermal noise, a larger beam size was designed, and a low-loss optical
coating was developed. A parametric instability145,146 that arises from mode
coupling between an acoustic oscillation of the mirror substrate and the optical
cavity electromagnetic field is one of the concerns in a high-power resonant cavity;
it can be damped by appropriate optical feedback.147,148

With these improvements, the advanced LIGO detector can catch an event occurring
at more than 300 Mpc in the optimum configuration.^e The interferometer is
operated under the standard quantum limit at the most sensitive frequency. As
described in the previous section, squeezed vacuum light was introduced into the
anti-symmetric port, which yielded a sensitivity improvement.140

Figure 23 shows the progress of the strain sensitivity at Livingston, which surpasses
that of any interferometer in the initial LIGO.'?!

LIGO plans to export one of the interferometers in Hanford to India for LIGO-India
observations.150


4.2. Virgo project 


Under a French and Italian collaboration, the construction of the 3 km baseline length
interferometer of the Virgo project? was completed in 2003 at Cascina near Pisa,
Italy (Fig. 24). The scientific objective of the Virgo project is to detect gravitational-wave
events, as in the LIGO project, and the target sensitivity was similar to that of LIGO.
By 2005, its final configuration was settled, and a scientific data-taking run was
started in 2007,151 which ended in 2008 together with LIGO after conducting collaborative
observations (named VSR1).

^e 120 Mpc in average distance.
^f The advanced LIGO initiated its observation run (O1) in September 2015 and has improved
the sensitivity by three times compared with S6.

Fig. 23. Progress of the strain sensitivity of one of the advanced LIGO interferometers, at
Livingston.

Figure credit: LIGO Laboratory/LIGO Scientific Collaboration.

Fig. 24. The Virgo interferometer is constructed in the suburbs of Pisa, Italy.

Photo credit: European Gravitational Observatory.

Fig. 25. Virgo detector, characterized by its seismic noise attenuation system (SAS) to achieve
attenuation at low frequencies, especially for continuous waves.

Source: The figure is taken from Ref. 152.

In coincidence with the enhanced LIGO, the upgraded configuration, Virgo+,
was used for a cooperative observation run, VSR2, with LIGO S6. The optical
configuration of the Virgo interferometer was the same as that of LIGO, and was
characterized by its seismic-noise attenuation system (SAS) to achieve attenuation
of seismic noise at low frequencies, the design of which is shown in Fig. 25.152 Higher
sensitivity in the low frequency band gives the possibility to set upper limits for
signals coming from continuous sources such as the Vela and Crab pulsars. The sensitivity
improvement was similar to that of LIGO: it took about four years to reach the
level of the design sensitivity, and further improvement was applied, as shown
in Fig. 26. In 2007, the LIGO and Virgo collaborations exchanged a memorandum
of agreement to conduct cooperative observation runs, which will be effective in
promoting the research of gravitational-wave astronomy.’ During this cooperative
observation (S6 in LIGO and VSR1 and VSR2 in Virgo), the goal sensitivity was
mostly achieved, as shown in Fig. 26.153 The sensitivity was inferior to that of LIGO
at around the mid frequencies, while it was significantly better than that of LIGO at lower
frequencies of less than 40 Hz; this crossover frequency increased up to 70 Hz
by the end of VSR2 (S6) in 2010. The observable optimum distance was 15 Mpc
for neutron-star binary coalescence during VSR4 (2011), where an analysis of fast
spin-down young pulsars was published.154

Fig. 26. Sensitivity improvement of Virgo. The sensitivity is inferior to that of LIGO at around
mid frequencies. However, it is much better than LIGO at lower frequencies. (For color version,
see page I-CP12.)

Source: The figure is taken from a paper after VSR2 (see Ref. 153).

If an earthquake of magnitude greater than 6 occurred anywhere on Earth, the interferometer
lost lock, regardless of where it occurred. The performance of the SAS was enhanced
during VSR1 by applying adaptive control, which reduced the lock losses from several per week
to less than one per week. The Virgo interferometer was heavily affected by the
environment, such as strong winds, sea waves, and earthquakes. This influence, however,
acted at low frequencies. Although the description above may give the impression that
the cavity lock and the operation were fragile, the opposite was true in comparison with
LIGO: Virgo was considerably more robust. At low frequencies there was clearly a
significant influence of wind and micro-seismic noise when the weather conditions were
poor, but those effects were unobservable with LIGO. The situation is now going to be
different, because the advanced LIGO has a very good attenuation system. An earthquake
of magnitude 6 occurring on the other side of the world was enough to unlock the cavities,
yet Virgo lost lock only about once per week on average because of earthquakes. That was
an impressive stability at the time, compared with LIGO.

Because of the relatively large seismic noise, scattered light, which is regarded as
the major source of noise at mid frequencies, had to be reduced to reach the design
sensitivity.

As in the advanced LIGO, an upgrade to the advanced Virgo began in December
2011 and is now ongoing.'?! It adopts a dual recycling scheme, that is, ordinary power
recycling plus signal recycling, along with a parameter modification. The test mass
is increased to 42 kg. The laser beam is supplied by a fiber laser with a
power at the 200 W level in its final stage. At the beginning (first year), the Virgo laser
will be used, which is capable of providing up to 60 W. Since the optical power
is greater, the optical parts have to be compliant with a 10-fold increase in the
optical power. A DC-readout scheme is applied, as in the advanced LIGO. A larger
spot size on the mirror is designed to reduce the thermal noise. Any thermal-lens
effect is compensated by a sophisticated thermal compensation system, as in the
advanced LIGO, which is based on off-axis Hartmann wave-front sensors!®? and
a phase camera. New diaphragm baffles are installed to suppress any stray light
merging into the main beams of the interferometer. The payloads and vibration
isolation will be upgraded. Advanced Virgo is scheduled to have three different
operation modes:


(i) power recycling, 25 W, 
(ii) dual recycled, 125 W, tuned signal recycling, 
(iii) dual recycled, 125 W, detuned signal recycling, 


where the detuned signal recycling is chosen to optimize the BNS inspiral range. 
They can be considered as benchmark configurations for a reasonable step-by-step 
approach when facing increasing complexity. 

The installation began by the end of 2013, and the revised interferometer will 
be handed to researchers within 2015.!4 


4.3. GEO project 


GEO 600 is a collaborative project between German (Albert Einstein Institute, AEI)
and British (University of Glasgow, Cardiff University, etc.) researchers.
The interferometer is located near Hannover, Germany (Fig. 27).!° The optical layout
of GEO 600 is a simple Michelson interferometer with a baseline length of 600 m,
with a folded optical path in each arm and equipped with two additional mirrors:
a power recycling mirror and a signal recycling mirror, which form a dual recycled
optical scheme to enhance the sensitivity.°°

The GEO detector combines the features of an observing instrument with those of a
prototype for new technologies. The “advanced techniques” of GEO are:


• Monolithic suspensions (Refs. 76 and 157)
• Electro-static actuators
• Signal recycling
• Squeezing (since 2010).


Fig. 27. The GEO 600 interferometer is located near Hannover, surrounded by a vine field.


Photo credit: AEI/GEO600.


The first three items above are applied in the design of the second-generation
detectors that are now under construction.

Figure 28 shows the noise projection of the interferometer,!® a method first proposed
by Hild et al.1°° to find each noise source that is linearly coupled to the observed
noise spectrum. The plots in the figure correspond to data taken during the S5 LSC science run.

With respect to squeezing, squeezed vacuum-state light was injected into the
signal port so as to reduce the shot noise of GEO 600. A 3.5 dB improvement was
achieved in 2010 in the shot-noise limited frequency band, which is equivalent to
about a 3-fold increase in the number of detectable sources. When this technique was
applied over a long duration, a time-averaged improvement of 2.0 dB was obtained
(Fig. 29).18° In 2013, a 3.7 dB improvement was achieved, which is a 3.7 times
enhancement from the point of view of event rate.!©
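
The relation between a shot-noise improvement in dB and the gain in event rate can be sketched as follows. This is only a rough check under two assumptions that are not stated in the text: the quoted dB values refer to the noise amplitude spectral density (20 log10 convention), and the sources are uniformly distributed, so that the event rate scales with the cube of the observable range. The function names are illustrative.

```python
def amplitude_gain(db: float) -> float:
    """Sensitivity gain in strain amplitude for a noise reduction given in dB.

    Assumption: dB refers to amplitude spectral density (20*log10 convention).
    """
    return 10.0 ** (db / 20.0)


def event_rate_gain(db: float) -> float:
    """Gain in event rate, assuming rate ~ (observable range)**3."""
    return amplitude_gain(db) ** 3


for db in (2.0, 3.5, 3.7):
    print(f"{db:.1f} dB: amplitude x{amplitude_gain(db):.2f}, "
          f"event rate x{event_rate_gain(db):.2f}")
```

Under these assumptions, 3.5 dB corresponds to an event-rate gain of roughly 3.3 and 3.7 dB to roughly 3.6, close to the factors quoted above.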

GEO 600 partially joined the collaborative observation with LIGO and Virgo
during both S6 and VSR4. GEO researchers collaborate closely with LIGO researchers.
Glasgow researchers developed a monolithic suspension system for the advanced
LIGO, and Virgo researchers developed a variant of it for their advanced detector.
AEI contributed by supplying the advanced LIGO with a high-power laser
system.

GEO was upgraded to GEO-HF (optimized for the high-frequency range). Its
commissioning was conducted in 2011, utilizing a squeezing technique with
automatic alignment control and achieving 4+ dB, where the noise was reduced
at intermediate frequencies. The power of the main laser was increased to 30 W;
further, a new inner mirror compensation system was installed, and a thermal
compensation scheme was applied to the beam-splitter. GEO-HF operates
during the period in which both the advanced LIGO and the advanced Virgo are being
installed (probably until 2017). The duty cycle of the interferometer, which has been
80-90% so far, will be improved by the advancing GEO-HF project.

Fig. 28. Noise projection of various noise sources of GEO 600 taken during the S5 LSC science run.
Shot noise, feedback noise, magnetic noise, laser amplitude noise, oscillator phase noise, RF noise
and so on are plotted in order to find unidentified noise sources. There is a discrepancy between
the un-correlated sum of all noise projections and the observed sensitivity curve.


Source: This figure is taken from Ref. 158.

Fig. 29. Squeezed vacuum state light is introduced from its signal port in order to reduce the
shot noise of GEO 600. A 3.7 dB improvement was achieved in the shot-noise limited frequency
band in 2013.


4.4. TAMA/CLIO/LCGT(KAGRA) project


In the early 1990s, researchers in Japan had been separately conducting R&D efforts
utilizing the 20 m Fabry-Perot interferometer of NAOJ and the 100 m delay-line
interferometer of ISAS, as discussed in Sec. 3.1.5. After this period, researchers
formed a kind of consortium in order to develop a larger-scale interferometric
gravitational-wave detector in Japan. From the point of view of funding, it was not
easy to obtain the large amount needed to construct a km-scale interferometer
in Japan. A conservative approach was therefore adopted: acquiring and testing
the know-how on a prototype of a given scale, and then rescaling the design in steps
of one order of magnitude in arm length. Although this approach was quite effective
and brought the Japanese researchers to the top level of expertise in this field, it was not
straightforward to promote the project of a km-scale interferometer at the national
funding agencies. Since a 10 m-scale prototype had been tested, the next one should
be a 300 m-scale one. Therefore, researchers took the TAMA project as the next step.
When the construction of TAMA began in 1995, the site construction of the initial
LIGO had already started. Researchers had to conduct both the construction of
TAMA and the design of the km-scale interferometer project. It was not easy to
persuade the funding agency to approve large funding without any fruitful result from
TAMA. Large seismic noise did not permit TAMA to reach the design sensitivity within the
funded term (5 years + 2 years of extension). TAMA introduced the so-called TAMA-SAS
to reduce the large ground vibration at low frequencies, which improved the
low-frequency performance but also highlighted the need for more crucial choices to
be considered for the km-scale interferometer. Through these struggles, researchers
modified the km-scale design to make it more appealing and to accentuate its
distinctive features among the gravitational-wave projects in the world, and proposed
the Large scale Cryogenic Gravitational wave Telescope (LCGT), which adopted
an underground facility and cryogenics. Utilizing both the TAMA and CLIO
interferometers, various R&D projects were conducted, leading to the second-generation
interferometer techniques necessary for LCGT (now KAGRA).


4.4.1. TAMA 


TAMA is a 300 m baseline-length Fabry-Perot Michelson interferometer with power
recycling, located at the Mitaka campus of the National Astronomical Observatory,
Japan (NAOJ). By 2000, it had achieved the best sensitivity and the longest observation
run, earlier than any other long-baseline interferometer.1! The optical configuration
of TAMA is a power-recycled Michelson interferometer with a Fabry-Perot cavity
in each arm, similar to those of LIGO and Virgo. The laser beam produced
by a 10 W Nd:YAG laser was fed through a ring mode-cleaner cavity 10 m in length.
The vacuum tubes, 40 cm in diameter, are placed half underground (3 m in depth),
as shown in the upper left of Fig. 30. A test-mass mirror, 10 cm in diameter,
6 cm in length and weighing 1 kg, was suspended from an intermediate mass, which
was in turn suspended from a control platform fixed through four stems on the last stage
of the three-stage vibration isolation stack in the vacuum chamber.
The vibration isolation was augmented by an additional active isolator between the
legs and the floor under the vacuum chamber. It took two years to reach the
plateau of the noise spectrum shown in Fig. 31, which was the world-best
sensitivity realized by an interferometric gravitational-wave detector.!!

Fig. 30. TAMA vacuum tubes are placed half underground (upper left). An input mode cleaner
vacuum tube, 10 m long, is used (lower left). The test mass is suspended by a double pendulum
system on an isolation stack (right).

The lower-frequency noise arises from the fact that Mitaka is in the Kanto
area, a large part of which was formed in ancient times by volcanic ash from
Mount Fuji. TAMA researchers tried to reduce the effect of the large seismic noise
by installing SAS, which was originally developed in the Virgo project.161,162 The
improvement is shown in Fig. 32.

Fig. 31. Sensitivity improvement of TAMA from just after installation until achieving the world
record at that time; the initial LIGO began its commissioning one year later.

Fig. 32. Vibration noise at low frequencies was effectively reduced by the installation of
TAMA-SAS.
Source: This figure is taken from Ref. 161.

Shot-noise-limited sensitivity at higher frequencies was achieved immediately after
construction was finished, but the sensitivity at lower frequencies was largely
dictated by mechanical noises originating from seismic noise, such as Barkhausen
noise!” in the actuator bar magnet and/or scattered stray-light noise, possibly due
to the large amplitude of the mirror suspension pendulum. Figure 33 shows the
achieved sensitivity of TAMA and also shows that the major limit arose from a


[Fig. 33 plot: displacement noise [m/√Hz] versus frequency [Hz]. Curves: dL- sensitivity,
intensity noise, alignment noise, dl- feedback noise, dl+ feedback noise, shot noise,
detector noise, detector + shot noise, frequency noise (err), up-conversion noise.]


Fig. 33. Achieved sensitivity spectrum of TAMA at the end of the term under the TAMA project. 
The design sensitivity was attained at higher than 800 Hz, where the shot noise limited the spec- 
trum, and obtained noise spectrum had a discrepancy at lower frequencies. The curve of up- 
conversion was empirically determined by knowing the linear dependence of the noise spectrum 
against the current of the coil for test-mass actuation. 


possible up-conversion noise, where the noise curves were measured with increased
currents in the coil actuator for length control and estimated
for the typical current needed for normal operation of the interferometer.

The noise caused by the Barkhausen effect in the bar magnet used for mirror
actuation was first found in the initial LIGO; the large amplitude of the pendulum
motion induced up-converted broad-band noise at frequencies from 10 Hz to a
few hundred Hz, which still remained in the noise spectrum of the final stage of the
initial LIGO.”

The effect of large seismic noise on the interferometer was experimentally clarified
by moving the 20 m prototype from the Mitaka campus to the Kamioka mine. The
sensitivity improvement, compared in Fig. 34, is surprisingly large. The
optical configuration of the interferometer was a locked Fabry-Perot cavity
interferometer. The suspension of the test mass was simply a single pendulum, the
support frame of which was fixed on an optical table in the vacuum chamber. The
stability of the interferometer placed underground was reported to be good.!®


4.4.2. CLIO 


CLIO is a 100 m baseline-length cryogenic locked Fabry-Perot interferometer placed
underground at the Kamioka mine (Fig. 35). A thermal-noise-limited sensitivity was
achieved in 2009 by cooling the mirrors down to 10 K.'6* The series of key technical
developments that led to this result is described in this subsection.


[Fig. 34 plot: square root of noise power [m/√Hz] versus frequency [Hz], from 10 Hz to
1000 Hz; one curve is labeled Mitaka.]


Fig. 34. Effect of a large seismic noise on the interferometer, experimentally clarified by the 
20m prototype moved from Mitaka campus to Kamioka mine. The optical configuration of the 
interferometer was a locked Fabry—Perot type and the suspension of the test mass was simply 
a single pendulum, the support frame of which was fixed on an optical table inside the vacuum 
chamber. 


Fig. 35. An end cryostat of CLIO placed underground at Kamioka mine. Thermal noise at 
cryogenic temperature, 10 K, was achieved in 2009. (For color version, see page I-CP13.) 


Considering the sensitivity improvement of the first-generation laser interferometers,
both LIGO and Virgo chose not to adopt cryogenic mirrors.
The financial reason why LCGT had to choose cryogenics was touched upon at the
beginning of this section. Cooling the mirrors is a direct way to reduce thermal
noise, unless the mechanical loss increases at cryogenic temperature. Since fused
silica, which is widely used in interferometers operating at room temperature,
has a higher mechanical loss when it is cooled, sapphire crystal is chosen in place
of fused silica for the substrate of the cryogenic mirror. Sapphire crystal has a high
mechanical Q and extremely high thermal conductivity at cryogenic temperature.
Since it has greater optical loss, how to extract the heat produced inside the substrate
is a serious concern. In designing the whole payload structure, we recognized
at an early stage that there was no need to worry about the thermal lensing effect,
which is the main concern in systems operated at room temperature.

Since it is not realistic to cool down the whole interferometer, including the beam
tubes extending over km scales, only the test-mass mirror is cooled, as schematically
shown with a suspension subsystem in Fig. 36. The first question, how to cool the mirror,
is easy to answer. Since heat is produced in the mirror by the high optical power, even if
a low-absorption substrate and optical coating are developed, the heat must be efficiently
extracted. There is no way other than using a heat conductor attached directly to the
mirror, namely the suspension fiber. The second question is how much improvement
is expected by cooling the mirror, which depends on how the mechanical loss changes
as the temperature is lowered. The third is how to block the thermal
radiation coming from the beam tube, which is at room temperature.

Cryogenic cooling tests of sapphire mirrors were conducted to address the first
question. They confirmed that suspension fibers of reasonable thickness were able
to extract enough heat.!® This is due to a rapid increase of the thermal conductivity,
by four orders of magnitude, at cryogenic temperature, where the fiber is made of pure
sapphire crystal. However, the thermal conductivity is affected by the diameter of the
fiber due to phonon scattering.'®° Regarding the second question, the mechanical
quality factors of the mirror substrate, suspension fiber, and optical coating
were measured, and the improvement obtained by cooling the mirror with the
suspension system was confirmed.125,167,168 In relation to the third question, the
conduction of thermal radiation through a metal shield pipe in a cryostat was studied,
and a significant reduction of the heat flow by appropriately arranged radiation
baffles was found.!®? Even if the heat is suppressed, contamination by residual
gas may degrade the mirror quality. The contamination speed was measured in a setup
simulating the practical arrangement of the high-finesse cavity mirror, and the
effect was found to be controllable.t”

Fig. 36. Schematic structure of the cryogenic mirror with the suspension system. The heat
produced in the mirror is extracted through heat conductors. The reaction mass is not for CLIO,
but for LCGT(KAGRA).

There was no measurement of the optical absorption in the sapphire
material around a wavelength of 1 µm, which is expected to be used. A measurement
was therefore made to check the optical quality of the sapphire substrate at cryogenic
temperature. Sapphire crystal was one of the candidates for the advanced LIGO
optics; however, it was dropped due to unreliable production quality.'? The sapphire
substrates were all produced by Crystal System Ltd. in 2001. The measured
absorption was 90 ppm/cm,!"? which was higher than expected. Significant quality
control is necessary for practical usage at cryogenic temperature. The worst
situation that arises from the absorption of laser power inside the mirror substrate
is thermal lensing. This effect was harmful in both the initial LIGO and
Virgo, where TCS was applied to compensate for the optical deformation. However,
the effect is greatly reduced in the cryogenic sapphire substrate owing to its
huge thermal conductivity.1”?

By utilizing this experience and knowledge, the CLIO interferometer was
designed and developed.!”? With this cryogenic interferometer, a direct measurement
of the thermal fluctuation of a high-Q pendulum was obtained.!”4

Since CLIO was placed underground, valuable knowledge and experience were
obtained for the km-scale detector. They include knowledge about the cryo-cooler
system,!”® the maintenance of mechanical devices, dust-prevention techniques, and
so on.


4.4.3. LCGT(KAGRA) 


Although LCGT (now KAGRA) was originally planned in 1999,'°° its funding
was approved as one of the national scientific projects in 2010 (Ref. 176), and its
construction started. As with all other national projects, a nickname for the LCGT
project, KAGRA, was chosen from submissions from the public in 2012. It is a 3 km
baseline-length power-recycled Fabry-Perot Michelson interferometer having the RSE
configuration with cryogenic mirrors, and is placed underground at Kamioka in Gifu
prefecture; a perspective view is shown in Fig. 37.

Fig. 37. KAGRA is a 3 km baseline length power-recycled Fabry-Perot Michelson interferometer
having RSE configuration with cryogenic mirrors, and is placed underground at Kamioka in Gifu
prefecture. (For color version, see page I-CP13.)


The KAGRA interferometer is placed more than 200 m below the surface of
the mountain. A 500 m long horizontal access tunnel to
the center area is dug from the surface entrance. The design of the center area is
shown in Fig. 38.

The rock of the mountain is utilized to form a two-layer structure for a tall vibration
isolation (SAS) system above the cryostat housing the cryogenic mirror, as shown
in Fig. 39. The top of the isolation system is 14 m above the floor, as shown in
Fig. 40. The design of the SAS is based on knowledge of TAMA-SAS.

The optical configuration of KAGRA is a power-recycled Fabry-Perot Michelson
interferometer utilizing the RSE scheme, as shown in Fig. 41. The sensitivity design
was based on signal recycling.4© A variable RSE with DC readout is planned
to be installed, and its control method is being investigated.!”” The limit beyond classical
noises is quantum noise, which can be overcome later by adopting QND squeezing
strategies. First, a homodyne phase is determined to cancel any photon shot noise
and radiation-pressure noise. Second, an optical spring effect will be utilized by a
detuning technique. Figure 42 shows the designed noise power spectrum. DRSE, in
the right-hand panel, is more sensitive at frequencies below 500 Hz, while
BRSE, in the left-hand panel, is better at higher frequencies. The first detection
of a gravitational wave can be achieved by DRSE, and the details of a merger can be
detected by BRSE. The most relevant characteristic features are the adoption of
cryogenic mirrors and the underground location. The third-generation
detector, the Einstein Telescope (ET), planned in EU countries, adopts both an underground
location and cryogenics. Since the other second-generation interferometers are placed
on the ground surface and do not use cryogenics, KAGRA is sometimes called
a 2.5-generation detector. The technical achievements obtained for the
KAGRA project are based on the cryogenic mirror development and the cryogenic
experiments conducted in the CLIO project. The practical design of the cryogenic
payload is being conducted under an EU-Japan research collaboration supported
by ELiTES.!”8 The techniques developed in KAGRA will contribute to the advancement
of gravitational-wave physics.

Fig. 38. The KAGRA cavern is designed to keep a depth of more than 200 m from the mountain
surface at every point of the interferometer. The figure schematically shows the center area. A
500 m long horizontal access tunnel to the center area is dug from the surface entrance.


Fig. 39. Cryogenic mirror suspended inside the cryostat. The concept of the cooling system is
described in relation to Fig. 36.

Fig. 40. Design of SAS, based on the knowledge of TAMA-SAS. A series of GAS filter stages
is supported from the top housing, which is fixed on the second floor carved into the mountain
rock.

Fig. 41. Optical configuration of KAGRA, which is a 3 km baseline length Fabry-Perot Michelson
interferometer with power recycling utilizing the RSE scheme.

Fig. 42. Design sensitivity of KAGRA. DRSE, shown in the right-hand panel, is more sensitive
at frequencies below 500 Hz, while BRSE, in the left-hand panel, is better at higher
frequencies.


4.4.4. Einstein telescope 


Advanced gravitational-wave detectors will soon succeed in the first detection of
a gravitational wave. However, since the detection rate will be a few per year, it
is not sufficient to open the era of precision gravitational-wave astronomy. A higher
detection rate can be achieved by enhancing the sensitivity, which also makes the SNR
better for closer gravitational-wave sources. ET was planned to meet this requirement
with a 10-fold sensitivity improvement (Fig. 43). This will be realized by an
interferometer of 10 km baseline length with cryogenic mirrors, placed underground as
schematically shown in Fig. 44,!” the design of which was disclosed in 2011,
financially supported as a design study by the EU committee.

Fig. 43. The sensitivity of ET is designed to achieve a sensitivity improvement by a factor of 10.


Fig. 44. Einstein Telescope planned in EU countries. The 10 km long arms of the triangular
interferometer are placed underground, and the main mirrors are cooled down to cryogenic
temperature. The artistic view is taken from the ET design document.


5. Summary 


Despite 60 years of effort to improve the sensitivity of gravitational-wave detectors,
no gravitational-wave event has yet been detected. The author
had hoped for a report of the detection of a gravitational wave
when cryogenic resonant antennae achieved their design sensitivities in the late
1980s. However, no detection was achieved and the theoretical bound receded
further. Almost 30 years have passed since then. In this centennial year
of Einstein’s general relativity, the advanced LIGO should start operation,
and in a few years all second-generation interferometers should begin observations
for the practical detection of gravitational waves, possibly under a world-wide
observation network. The author strongly believes that we will hear the report of
the first detection of a gravitational wave within a few years.

Appendix A. Thermal Noise


Since J. Weber, it is no exaggeration to say that thermal noise has been
the most difficult phenomenon to overcome for experimentalists aiming
to detect gravitational waves. Thermal noise is present everywhere: not only in the
antenna body itself but also in the transducer in the case of resonant antennae,
and not only in the mirror itself but also in the optical coating material in the
case of laser interferometers. Thermal noise dictates the ultimate performance of
all electronic devices.


A.1. Nyquist theorem 


A fluctuating voltage appears between the two terminals of an electric resistance, R, which
is in thermal equilibrium. Assuming temperature T, the mean square of the
voltage V is given using the Boltzmann constant, $k_B$:

$$\overline{V^2} = 4 R k_B T \,\Delta f, \tag{A.1}$$

where $\Delta f$ is the frequency bandwidth over which the voltage is measured.
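
As an illustration of Eq. (A.1), the following minimal sketch evaluates the root-mean-square thermal voltage of a resistor. The resistance, temperature and bandwidth are arbitrary example values, not values taken from the text, and the function name is illustrative.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant [J/K] (exact SI value)


def thermal_voltage_rms(resistance_ohm: float, temperature_k: float,
                        bandwidth_hz: float) -> float:
    """RMS thermal voltage from Eq. (A.1): sqrt(4 R k_B T delta_f)."""
    return math.sqrt(4.0 * resistance_ohm * K_B * temperature_k * bandwidth_hz)


# Example values (assumptions for illustration): 1 kOhm, 300 K, 1 Hz bandwidth.
v_rms = thermal_voltage_rms(1.0e3, 300.0, 1.0)
print(f"V_rms = {v_rms:.3e} V")
```

For these example values the result is about 4 nV, and it scales with the square root of the resistance, the temperature and the bandwidth.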

The power spectrum of random processes is described by the power spectrum
density, G(f), which is defined by an ensemble average of the time average of the
power consumption occurring in a unit resistance per unit frequency bandwidth.
Taking the frequency bandwidth between adjacent frequencies, for a measurement
time T (not to be confused with the temperature), as

$$\Delta f_n = f_{n+1} - f_n = \frac{n+1}{T} - \frac{n}{T} = \frac{1}{T}, \tag{A.2}$$

an ensemble average of the time average of the power consumption, $\langle P_n \rangle$,
is represented by

$$G(f_n)\,\Delta f_n = \langle P_n \rangle = \overline{V_n^2}, \tag{A.3}$$

where $V_n$ is the voltage in the frequency bandwidth and $G(f) = 4 R k_B T$, with T here
being the temperature.

Let us consider a system in which an output O is produced in response
to an input I. The input is something like a force, voltage and/or current, and
the corresponding output is a displacement, velocity and/or current. The system
is linear and characterized by a complex response function, Z(f), that
produces an output of $Z(f)A(f)e^{2\pi i f t}$ for an input of $A(f)e^{2\pi i f t}$. Assuming
that the input fluctuates in time t, as given by

$$I(t) = \int A(f)\, e^{2\pi i f t}\, df, \tag{A.4}$$

the fluctuation of the output is calculated by

$$O(t) = \int Z(f) A(f)\, e^{2\pi i f t}\, df. \tag{A.5}$$

Since the power spectrum of the input $G_I(f)$ is given by

$$G_I(f) = \lim_{T\to\infty} \frac{2}{T}\,|A(f)|^2, \tag{A.6}$$

the power spectrum of the output $G_O(f)$ becomes

$$G_O(f) = \lim_{T\to\infty} \frac{2}{T}\,|Z(f)A(f)|^2. \tag{A.7}$$

That is,

$$G_O(f) = |Z(f)|^2 G_I(f), \tag{A.8}$$

$$\overline{O^2} = \int G_O(f)\, df = \int |Z(f)|^2 G_I(f)\, df. \tag{A.9}$$

This formula gives a method to evaluate the response of the fluctuating quantity in
a linear system that is affected by a noise source.


A.2. Thermal noise of a harmonic oscillator 


Let us consider a harmonic oscillator of mass m and spring constant $k_{sp} = m\omega_0^2$
with damping $\beta = m/\tau_0$. A fluctuating force with a power spectrum independent of
the eigen-mode frequency, $\omega_0$, is applied to the oscillator. For an applied force
$A e^{2\pi i f t}$, the behavior of the oscillator is determined by

$$m\ddot{x} + \beta\dot{x} + k_{sp} x = A e^{2\pi i f t}. \tag{A.10}$$

Since the stationary solution of the equation is given by

$$x = \frac{A e^{2\pi i f t}}{-4\pi^2 m f^2 + 2\pi i \beta f + m\omega_0^2}, \tag{A.11}$$

the response function of this system is

$$Z(f) = \frac{1}{-4\pi^2 m f^2 + 2\pi i \beta f + m\omega_0^2}. \tag{A.12}$$

Denoting the input spectrum as $G_n$, the output mean square is calculated by

$$\overline{x^2} = G_n \int_0^\infty \frac{1}{(k_{sp} - 4\pi^2 m f^2)^2 + 4\pi^2 \beta^2 f^2}\, df
= G_n \frac{1}{4\beta k_{sp}}. \tag{A.13}$$

We assume here that the damping given by $\beta$ arises from a statistically
fluctuating force independent of any special system. If the system is in a thermal-
equilibrium state of temperature T, $k_{sp}\overline{x^2} = k_B T$ holds. By combining the above
equations, we obtain

$$G_n(f) = 4\beta k_B T. \tag{A.14}$$

In general, if a dynamical system described by a generalized coordinate, q, has a
damping of $-\beta\dot{q}$, a fluctuating force with power spectrum $4\beta k_B T$ always exists.
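
The consistency of Eqs. (A.12) to (A.14) can be checked numerically. The following is a minimal sketch, not part of the original derivation: it integrates $|Z(f)|^2 G_n$ with a simple trapezoidal rule and compares $k_{sp}\overline{x^2}$ with $k_B T$. The oscillator parameters (1 kg, 1 Hz, quality factor 10, with $\beta = m\omega_0/Q$) and the integration grid are illustrative assumptions.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant [J/K]


def response_sq(f, m, beta, k_sp):
    """|Z(f)|^2 for the response function of Eq. (A.12)."""
    real = k_sp - 4.0 * math.pi ** 2 * m * f * f
    imag = 2.0 * math.pi * beta * f
    return 1.0 / (real * real + imag * imag)


def mean_square_displacement(m, f0, q, temperature, f_max=200.0, df=1.0e-3):
    """Trapezoidal estimate of Eq. (A.13) with G_n = 4 beta k_B T (Eq. (A.14))."""
    omega0 = 2.0 * math.pi * f0
    k_sp = m * omega0 ** 2
    beta = m * omega0 / q  # assumption: damping expressed through a quality factor
    g_n = 4.0 * beta * K_B * temperature
    n = int(round(f_max / df))
    total = 0.5 * (response_sq(0.0, m, beta, k_sp)
                   + response_sq(n * df, m, beta, k_sp))
    for i in range(1, n):
        total += response_sq(i * df, m, beta, k_sp)
    return g_n * total * df, k_sp


TEMPERATURE = 300.0  # example value [K]
x2, k_sp = mean_square_displacement(m=1.0, f0=1.0, q=10.0, temperature=TEMPERATURE)
ratio = k_sp * x2 / (K_B * TEMPERATURE)
print(f"k_sp <x^2> / (k_B T) = {ratio:.6f}")
```

The printed ratio should be close to unity, which is the equipartition condition used to obtain Eq. (A.14).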

The damping that dictates the behavior of a dynamical system is not velocity damping,
but structure damping in the frequency region where mechanical vibration
dominates.418 If this is correct, the response function deduced in Eq. (A.12) needs to
be modified by replacing the spring constant, $k_{sp}$, by a complex spring constant,
$k_{sp}(1 + i\phi)$. The spectrum of the fluctuation force is then not white, but becomes
$G_n = 4(k_{sp}\phi/\omega)k_B T$ according to the fluctuation-dissipation theorem. The
frequency spectrum of the output response to this force is

$$|x(\omega)|^2 = \frac{4 k_B T}{m\omega}\,
\frac{\omega_0^2 \phi}{\left|-\omega^2 + [1 + i\phi]\,\omega_0^2\right|^2}. \tag{A.15}$$

A large difference from the case of velocity damping is that the magnitude of the
spectrum increases in proportion to $f^{-1}$ toward lower frequencies, and decreases
more rapidly toward higher frequencies. That is, for $\omega \ll \omega_0$,

$$|x(\omega)|^2 \approx \frac{4 k_B T \phi}{m\,\omega\,\omega_0^2}, \tag{A.16}$$

which is useful to calculate the noise power spectrum arising from the thermal noise
of the mirror substrate in the observation frequency band, and for $\omega \gg \omega_0$,

$$|x(\omega)|^2 \approx \frac{4 k_B T \omega_0^2 \phi}{m\,\omega^5}, \tag{A.17}$$

which is useful to evaluate the noise power spectrum from pendulum thermal noise
in the observation frequency band.
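
The two limiting forms can be compared with the full structure-damping spectrum numerically. The sketch below assumes the spectrum has the form $|x(\omega)|^2 = (4k_BT/m\omega)\,\omega_0^2\phi/|-\omega^2 + (1+i\phi)\omega_0^2|^2$, with loss angle $\phi$; the numerical parameters (1 kg, 1 Hz resonance, $\phi = 10^{-6}$, 300 K) are illustrative assumptions only.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant [J/K]


def x2_structure(omega, m, omega0, phi, temperature):
    """Full structure-damping displacement spectrum (form of Eq. (A.15))."""
    denom = abs(-omega ** 2 + (1.0 + 1j * phi) * omega0 ** 2) ** 2
    return 4.0 * K_B * temperature / (m * omega) * omega0 ** 2 * phi / denom


def x2_below_resonance(omega, m, omega0, phi, temperature):
    """Limit for omega << omega0 (form of Eq. (A.16)), proportional to 1/omega."""
    return 4.0 * K_B * temperature * phi / (m * omega * omega0 ** 2)


def x2_above_resonance(omega, m, omega0, phi, temperature):
    """Limit for omega >> omega0 (form of Eq. (A.17)), proportional to 1/omega**5."""
    return 4.0 * K_B * temperature * omega0 ** 2 * phi / (m * omega ** 5)


# Example parameters (assumptions for illustration).
M, OMEGA0, PHI, TEMP = 1.0, 2.0 * math.pi, 1.0e-6, 300.0

low = 0.01 * OMEGA0
high = 100.0 * OMEGA0
ratio_low = x2_structure(low, M, OMEGA0, PHI, TEMP) / x2_below_resonance(low, M, OMEGA0, PHI, TEMP)
ratio_high = x2_structure(high, M, OMEGA0, PHI, TEMP) / x2_above_resonance(high, M, OMEGA0, PHI, TEMP)
print(f"full/limit below resonance: {ratio_low:.5f}")
print(f"full/limit above resonance: {ratio_high:.5f}")
```

Both ratios should be close to unity two decades away from the resonance, which shows where the simpler limiting expressions can be used.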


Appendix B. Modulation 


To satisfy the condition of minimum noise in a Michelson interferometer,
the output is held at the null (dark) fringe. Even in this condition, a modulation
technique can be applied to extract a non-null signal by placing an optical phase
modulator in each beam path, as shown in Fig. 45.

Fig. 45. Optical phase modulators are installed to extract the signal under the minimum
shot-noise condition. Electro-optical modulators (EOMs) are set in the beam paths, and the
modulation voltages are applied in opposite phases. This is called the internal modulation
technique. Since the optical phase is distorted through modulator imperfections and there is a
limit on the light power, internal modulation is not applied in present practical interferometers.


EOM$_1$ and EOM$_2$ are driven in counter phase, and the modulation index of each is $m/2$. The electric fields of the light beams that enter the beam-splitter are combined as

$$E = \frac{E_0}{2}\,e^{-i\left(\phi_1 + \frac{m}{2}\sin\omega_m t\right)} - \frac{E_0}{2}\,e^{-i\left(\phi_2 - \frac{m}{2}\sin\omega_m t\right)}, \tag{B.1}$$

where $\phi_1 - \phi_2 = \Delta\phi$ is assumed. The photo-current is described, using $I_p$ to be 2 times the current in the case of $m = 0$, as

$$\frac{I}{I_p} = \frac{1}{2} - \frac{1}{2}\cos\Delta\phi\,\cos(m\sin\omega_m t) + \frac{1}{2}\sin\Delta\phi\,\sin(m\sin\omega_m t), \tag{B.2}$$

which is expanded in Bessel functions as

$$I = \frac{I_p}{2}\,[1 - J_0(m)] + I_p J_1(m)\,\Delta\phi\,\sin\omega_m t + (\text{higher harmonics of } \omega_m). \tag{B.3}$$


The first term is the DC component, $I_{dc}$, and the second term is the first-order term of the modulation, $I_{\omega_m}$. This current is fed into a demodulator circuit and the coefficient of $\sin\omega_m t$ is extracted. The noise current is $I_{mod} = \sqrt{2}\sqrt{2eI_{dc}}$, where the white noise due to $I_{dc}$ produces two side-band noises at around $\omega_m/2\pi$. The phase noise equivalent to this current becomes the minimum detectable phase, which is described by

$$\Delta\phi_{min} = \frac{I_{mod}}{\partial I_{\omega_m}/\partial\Delta\phi} = \frac{\sqrt{1 - J_0(m)}}{J_1(m)}\sqrt{\frac{2e}{I_p}}. \tag{B.4}$$

Note here that the magnitude tends to be equal to the minimum phase obtained earlier if $m \to 0$, since $\sqrt{1 - J_0(m)}/J_1(m) \to 1$ in that limit.
On the other hand, if we take the condition $\phi_1 - \phi_2 = \pi + \Delta\phi$, the operating fringe becomes bright, but the signal-to-noise ratio becomes worse when $m$ is taken to be small, as described by

$$\Delta\phi_{min} = \frac{I_{mod}}{\partial I_{\omega_m}/\partial\Delta\phi} = \frac{\sqrt{1 + J_0(m)}}{J_1(m)}\sqrt{\frac{2e}{I_p}}, \tag{B.5}$$

because the above $I_{dc}$ is then $(I_p/2)[1 + J_0(m)]$ and $I_{\omega_m}$ becomes $-I_p J_1(m)\Delta\phi$. The above
explanation is based on the book.!” 
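
The dependence of the two minimum-phase expressions on the modulation index can be illustrated numerically. The sketch below is an assumption-laden illustration: it evaluates the dimensionless factors $\sqrt{1 - J_0(m)}/J_1(m)$ (dark fringe) and $\sqrt{1 + J_0(m)}/J_1(m)$ (bright fringe) that multiply $\sqrt{2e/I_p}$, using a hand-written power series for the Bessel functions so that only the standard library is needed. The function names are hypothetical.

```python
import math


def bessel_j(order, x, terms=40):
    """Bessel function of the first kind J_order(x) from its power series."""
    total = 0.0
    for k in range(terms):
        total += ((-1) ** k * (x / 2) ** (2 * k + order)
                  / (math.factorial(k) * math.factorial(k + order)))
    return total


def dark_fringe_factor(m):
    """sqrt(1 - J0(m)) / J1(m): prefactor of sqrt(2e/I_p) for the dark fringe."""
    return math.sqrt(1 - bessel_j(0, m)) / bessel_j(1, m)


def bright_fringe_factor(m):
    """sqrt(1 + J0(m)) / J1(m): prefactor of sqrt(2e/I_p) for the bright fringe."""
    return math.sqrt(1 + bessel_j(0, m)) / bessel_j(1, m)


if __name__ == "__main__":
    for m in (0.01, 0.1, 0.5, 1.0):
        print(f"m = {m}: dark = {dark_fringe_factor(m):.4f}, "
              f"bright = {bright_fringe_factor(m):.2f}")
```

For small $m$ the dark-fringe factor stays close to 1, whereas the bright-fringe factor grows roughly as $2\sqrt{2}/m$, which is why the bright-fringe operating point is unfavorable at small modulation index.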


Appendix C. Fabry—Perot Interferometer 
C.1. Fabry-Perot cavity 


Here, the response of the Fabry-Perot cavity to phase-modulated light is analyzed and compared with an experiment.180 The Fabry-Perot cavity consists of two high-reflection mirrors facing each other at a distance of $\ell$ and can trap light inside, as shown in Fig. 46.

Mirrors have high-reflectivity optical coatings on their facing inner surfaces, and anti-reflection coatings on the outside surfaces. When light is reflected at the inner surface, the light phase changes due to the reflection by the harder refractive material. The phase advancement due to propagation inside the cavity is $\Delta = \Omega\ell/c$, where $\Omega$ is the angular frequency of light. Let $r$ and $t$ be the amplitude reflectivity and the amplitude transmittance (distinguishing the two mirrors by suffixes 1 and 2). Denoting the amplitude of the input light beam as $A_i$, the amplitude of the reflected beam, $A_r$, is represented by


$$\begin{aligned}
A_r &= \left[(ir_1) + t_1^2(ir_2)e^{-2i\Delta} + t_1^2(ir_1)(ir_2)^2e^{-4i\Delta} + \cdots\right]A_i \\
&= \left[ir_1 + t_1^2(ir_2)e^{-2i\Delta}\sum_{n=0}^{\infty}(ir_1)^n(ir_2)^n e^{-2in\Delta}\right]A_i \\
&= \left[ir_1 + \frac{t_1^2(ir_2)e^{-2i\Delta}}{1 + r_1r_2e^{-2i\Delta}}\right]A_i.
\end{aligned} \tag{C.1}$$


Fig. 46. Light beam introduced from the left hand side. The outer surface of the mirror has an 
anti-reflection optical coating and the inner surface has high reflectivity due to an optical coating. 
The mirror of the right-hand side has a similar surface treatment. Light is trapped inside the 
cavity, and a tiny amount of light power leaks from the right-hand side mirror.

Fig. 47. Transmitted light of the Fabry-Perot cavity in resonance. The frequency gap between adjacent resonances is called the free spectral range (FSR), which is given by $\Delta\nu_{FSR} = c/2\ell$. The finesse, $\mathcal{F}$, is defined by the ratio of $\Delta\nu_{FSR}$ and $\Delta\Omega/2\pi$ (width of resonance).


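Also, the amplitude of the transmitted light, $A_t$, is

$$\begin{aligned}
A_t &= \left[t_1t_2e^{-i\Delta} + t_1t_2(ir_1)(ir_2)e^{-3i\Delta} + t_1t_2(ir_1)^2(ir_2)^2e^{-5i\Delta} + \cdots\right]A_i \\
&= t_1t_2e^{-i\Delta}\sum_{n=0}^{\infty}(ir_1)^n(ir_2)^n e^{-2in\Delta}A_i = \frac{t_1t_2e^{-i\Delta}}{1 + r_1r_2e^{-2i\Delta}}A_i.
\end{aligned} \tag{C.2}$$

When the condition $e^{-2i\Delta} = -1$ holds, light is resonantly trapped; the transmitted light is drawn in Fig. 47. The gap between the resonances is called the FSR, which is given by $\Delta\nu_{FSR} = c/2\ell$; the finesse, $\mathcal{F}$, is defined by the ratio of $\Delta\nu_{FSR}$ and $\Delta\Omega/2\pi = \Delta\nu$ (width of resonance):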


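$$\mathcal{F} = \frac{\Delta\nu_{FSR}}{\Delta\nu}. \tag{C.3}$$

The power of the transmitted light is given by

$$\frac{I_t}{I_i} = \left|\frac{t_1t_2e^{-i\Delta}}{1 + r_1r_2e^{-2i\Delta}}\right|^2 = \left[\frac{t_1t_2}{1 - r_1r_2}\right]^2\frac{1}{1 + F\sin^2\delta}, \tag{C.4}$$

where $\delta = \Delta - \pi/2$ is the phase offset from the resonance condition and

$$F = \frac{4r_1r_2}{(1 - r_1r_2)^2} = \left(\frac{2\mathcal{F}}{\pi}\right)^2. \tag{C.5}$$

This represents a transfer function of a bandpass filter with a Q-value of $2\ell\mathcal{F}/\lambda$.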
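
The cavity relations can be tried out with a few lines of code. The following is a minimal sketch, assuming the transmitted amplitude has the geometric-series form $t_1t_2e^{-i\Delta}/(1 + r_1r_2e^{-2i\Delta})$ with resonance at $e^{-2i\Delta} = -1$; the finesse expression $\pi\sqrt{r_1r_2}/(1 - r_1r_2)$ is obtained by inverting $F = (2\mathcal{F}/\pi)^2$. All function names and the sample numbers (a 3 km cavity, 1064 nm light, amplitude reflectivity 0.99) are illustrative assumptions.

```python
import cmath
import math

C_LIGHT = 299_792_458.0  # speed of light [m/s]


def transmitted_power_ratio(delta, r1, r2, t1, t2):
    """I_t / I_i for one-way propagation phase delta = Omega * length / c."""
    amplitude = (t1 * t2 * cmath.exp(-1j * delta)
                 / (1 + r1 * r2 * cmath.exp(-2j * delta)))
    return abs(amplitude) ** 2


def coefficient_of_finesse(r1, r2):
    """F = 4 r1 r2 / (1 - r1 r2)^2."""
    return 4 * r1 * r2 / (1 - r1 * r2) ** 2


def finesse(r1, r2):
    """Finesse obtained from F = (2 * finesse / pi)^2."""
    return math.pi * math.sqrt(r1 * r2) / (1 - r1 * r2)


def free_spectral_range(length):
    """Frequency gap between adjacent resonances, c / (2 * length) [Hz]."""
    return C_LIGHT / (2 * length)


def quality_factor(length, wavelength, r1, r2):
    """Q-value 2 * length * finesse / wavelength of the equivalent bandpass filter."""
    return 2 * length * finesse(r1, r2) / wavelength


if __name__ == "__main__":
    r = 0.99                    # hypothetical amplitude reflectivity
    t = math.sqrt(1 - r * r)    # lossless mirror assumed: r^2 + t^2 = 1
    print("on resonance :", transmitted_power_ratio(math.pi / 2, r, r, t, t))
    print("off resonance:", transmitted_power_ratio(0.0, r, r, t, t))
    print("finesse      :", finesse(r, r))
    print("FSR [Hz]     :", free_spectral_range(3000.0))
    print("Q            :", quality_factor(3000.0, 1064e-9, r, r))
```

With two identical lossless mirrors the transmission reaches unity on resonance and drops by roughly the factor $1/(1 + F)$ halfway between resonances, which is the bandpass behaviour described above.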
C.2. Frequency response of a Fabry—Perot Michelson 
interferometer 


A Fabry-Perot Michelson interferometer has Fabry-Perot cavities in place of the arm mirrors of the Michelson interferometer. The frequency response of the Fabry-Perot Michelson interferometer is obtained by repeatedly applying the calculation used for the simple Michelson interferometer, while considering an amplitude reduction by $r_1r_2$ and a retarded phase of $2\Delta$ in each light return. The response function is

$$H_{FP}(\omega) = \frac{\alpha\Omega}{\omega}\,\sin\!\left(\frac{\omega\ell}{c}\right)\frac{e^{-i\omega\ell/c}}{1 - r_1r_2e^{-2i\omega\ell/c}}, \tag{C.6}$$

where $\alpha = t_1^2r_2/(1 - r_1r_2)$. Also,

$$|H_{FP}(\omega)| = \frac{\alpha\Omega}{\omega\,(1 - r_1r_2)}\,\frac{\left|\sin(\omega\ell/c)\right|}{\sqrt{1 + F\sin^2(\omega\ell/c)}},$$

where $F$ is given by Eq. (C.5).


Appendix D. Newtonian Noise



If the surrounding gravity gradient changes, suspended test masses experience accel- 
eration due to the changes, which causes the so-called Newtonian noise. The source 
of the gravity gradient change is any density change due to some elastic deformation 
of the ground, atmospheric pressure, underground water level and so on. As is easily 
expected, since the period of the vibration is long compared with other noisy dynam- 
ics, it affects the sensitivity at lower than a few Hz in third-generation detectors. 
The first analytical estimation was made by Saulson,!*! where 3x 107° m/ V Hz was 
estimated at 10 Hz by seismic noise. Although it is not harmful in second-generation 
detectors, suppression of the Newtonian noise is a benefit to widen the frequency 
band towards lower frequency, where astrophysical sources are much more abun- 
dant. An experiment to reduce the effect on test masses was conducted by feeding 
back filtered seismic array data.182

Gravitational Wave (GW) detection in space is aimed at the low frequency band (100 nHz-100 mHz) and the middle frequency band (100 mHz-10 Hz). The science goals are the detection of GWs from (i) Supermassive Black Holes; (ii) Extreme-Mass-Ratio Black Hole Inspirals; (iii) Intermediate-Mass Black Holes; (iv) Galactic Compact Binaries and (v) Relic GW Background. In this paper, we present an overview of the sensitivity, orbit design, basic orbit configuration, angular resolution, orbit optimization, deployment, time-delay interferometry (TDI) and payload concept of the currently proposed GW detectors in space under study. The detector proposals under study have arm lengths ranging from 1000 km to $1.3 \times 10^9$ km (8.6 AU), including (a) Solar orbiting detectors: Astrodynamical Space Test of Relativity using Optical Devices optimized for GW detection (ASTROD-GW), Big Bang Observer (BBO), DECi-hertz Interferometer GW Observatory (DECIGO), evolved LISA (e-LISA), Laser Interferometer Space Antenna (LISA), other LISA-type detectors such as ALIA, TAIJI etc. (in Earthlike solar orbits), and Super-ASTROD (in Jupiterlike solar orbits); and (b) Earth orbiting detectors: ASTROD-EM/LAGRANGE, GADFLI/GEOGRAWI/g-LISA, OMEGA and TIANQIN.


Keywords: Gravitational waves; space gravitational wave detectors; dark energy; galaxy 
co-evolution with black holes; inflation; galactic compact binaries. 


PACS Numbers(s): 04.80.Nn, 04.80.—y, 95.30.Sf, 95.55.Ym, 98.62.Ai, 98.80.Es 


1. Introduction 


Gravitational Wave (GW) detection has been a focused research subject for some time. With the announcement of the LIGO direct GW detection,1,2 we are fully ushered into the age of GW astronomy. Second-generation ground-based interferometers are being upgraded/completed for GW detection in the high-frequency band (10 Hz-100 kHz; see Refs. 3-5 for a complete spectral classification of GWs).6 Observational data from Pulsar Timing Arrays (PTAs) are being accumulated for the first
GW detection in the very low frequency band (300 pHz-100nHz).’ Collaborations 
working on Cosmic Microwave Background (CMB) observations are actively push- 
ing their sensitivities further for detecting imprints of primordial GWs in the Hubble 
frequency band (1 aHz-10fHz) on B-mode polarizations. LISA (Laser Interferom- 
eter Space Antenna)® Pathfinder!® launched on 3 December 2015 has successfully 
demonstrated the drag-free technology! for space detection of GWs in the middle 
and low frequency band (0.1 Hz-10 Hz; 100nHz-—0.1 Hz). The activities are mount- 
ing in this centennial year (2015-2016) of the establishment of general relativity. 

With the invention of lasers in 1960, the implementation of satellite laser ranging and lunar laser ranging in the 1960s and the development of drag-free navigation for geodesy in the 1970s, the concept of laser interferometry in space for GW detection was developed in the 1980s. The first public proposal on space interferometers for
GW detection was presented at the Second International Conference on Preci- 
sion Measurement and Fundamental Constants (PMFC-II), 8-12 June 1981, in 
Gaithersburg.!?:'3 In this seminal proposal, Faller and Bender raised possible 
GW mission concepts in space using laser interferometry. Two basic ingredients 
were addressed — drag-free navigation for the reduction of perturbing forces on 
the spacecraft (S/C) and laser interferometry for the sensitivity of measurement. 
LISA-like S/C orbit formation was reached in 1985 in the proposal Laser Antenna 
for Gravitational-radiation Observation in Space (LAGOS).'* A schematic of LISA- 
type orbit configuration is shown in Fig. 1. It is natural for people like Bender and 
Faller working in lunar laser ranging and measuring free-fall acceleration using 
interferometry to propose such an experiment. In fact, test mass free fall inside a 
falling shroud in vacuum in the interferometric measurement of the Earth’s gravita- 
tional acceleration can be considered as a passive drag-free navigation device.!° The 
discrepancy in the absolute gravimeter comparison at the Bureau International des 
Poids et Mesures (BIPM) is partially resolved using correction to interferometric 
measurements of absolute gravity arising from the finite speed of light.!® In the 
S/C tracking, the finite velocity of light has always been incorporated. Both the 
test mass for GW missions and the test mass of interferometric gravimeter can be 
regarded as freely falling objects in the solar system and tracked using astrodynam- 
ical equation. Thus, we see the interplay among space geodesy, Galileo Equivalence 
Principle (Universality of Free Fall) experiments in space and GW detection mis- 
sions. Recent development for a GRACE follow-on mission SAGM (Space Advanced 
Gravity Measurements),!” TEPO!® (testing the equivalence principle with optical 
readout in space) and TIANQIN!?® (a space-borne GW detector) can be considered 
as such an example. 

A big step for the GW detection in space is the 1993 ESA M3 Assessment 
study of LISA and later recommendation as the third cornerstone of “Horizon 2000 
Plus”. After 2000, LISA became a joint ESA-NASA mission until the 2011 NASA 


GW detection in space 1-581 


Fig. 1. Schematic of LISA-type orbit configuration in Earthlike solar orbit.9 (For color version, 
see page I-CP14.) 


withdrawal. In 1998, LISA Pathfinder was selected as the second of the European 
Space Agency’s Small Missions for Advanced Research in Technology (SMART) to 
develop and to test the demanding drag-free technology. At this occasion of Cen- 
tennial Celebration of General Relativity, ESA has successfully launched the LISA 
Pathfinder on a Vega rocket from Europe’s spaceport in Kourou, French Guiana 
on 3 December 2015, and has successfully demonstrated the drag-free technology! 
for observing GWs from space. Based on the ongoing technological development 
for LISA Pathfinder, ESA has sponsored a technology reference study (completed 
in 2008) for the fundamental physics explorer as a common bus for fundamen- 
tal physics missions.2° New Gravitational-wave Observatory (NGO)/evolved LISA 
(eLISA),?! down-scaled from 5 million km to 1 million km arm length, was proposed 
in 2011 to accommodate the budget change and received excellent evaluation. In 
November 2013, ESA announced the selection of the Science Themes for the L2 
and L3 launch opportunities — the “Hot and Energetic Universe” for L2 and “The 
Gravitational Universe” for L3.2? ESA L3 mission is likely to have a launch oppor- 
tunity in 2034.7? Since eLISA/NGO GW mission concept is the major candidate 
at this time and it takes one year to transfer to the science orbit, a starting time 
for science phase is likely in 2035. Since 2035 is still 20 years away, it is not yet 
the time to freeze the specific mission concept. At present a comparison of laser 
measurement technology and atom interferometry is underway in ESA. 

The general concept of Astrodynamical Space Test of Relativity using Optical 
Devices (ASTROD) is to have a constellation of drag-free S/Cs navigate through 
the solar system and range with one another using optical devices to map the solar 
system gravitational field, to measure related solar system parameters, to test rela- 
tivistic gravity, to observe solar g-mode oscillations and to detect GWs. A baseline 
implementation of ASTROD was proposed in 1993 and has been under concept and 
laboratory studies since then.?* 8° In 1996, ASTROD I (Mini-ASTROD) with one S/C ranging with ground stations was proposed for testing relativistic gravity and mapping the solar system.?? The mission study shows that the precision of testing relativistic gravity in the solar system is achievable to $10^{-9}$-$10^{-8}$ in terms of the Eddington parameter $\gamma$, which is more than three orders of magnitude improvement over the
present precision, with accompanying improvement in other aspects of relativistic 
gravity.°! °° Early in 2009, responding to the call for GW mission studies of Chinese 
Academy of Sciences (CAS), a dedicated mission concept ASTROD optimized for 
Gravitational Wave detection (ASTROD-GW) for GW detection with 3 S/C (spacecraft) orbiting near Sun-Earth Lagrange points L3, L4 and L5 respectively with
nominal arm length of 260 million km was proposed and studied.*3* 4° A schematic 
of ASTROD-GW orbit configuration with inclination is shown in Fig. 2.34! Before 
the ASTROD-GW proposal, Super-ASTROD which was proposed in 19967? with 
5/C’s in Jupiterlike orbits was studied as a dual mission for GW measurement 
and for cosmological model/relativistic gravity test in 2008.4? With the proposal of 
ASTROD-GW, the baseline GW configuration of Super-ASTROD makes 3 out of 
4-5 S/C orbiting near Sun—Jupiter Lagrange points L3, L4 and L5, respectively. For 
the possibility of a down scaled version of ASTROD-GW mission, the ASTROD- 
EM with the orbits of 3S/C near Earth-Moon Lagrange points L3, L4 and L5 
respectively has been under study.*? 

DECi-hertz Interferometer GW Observatory (DECIGO)*4 was proposed in 2001 
with the aim of detecting GWs from early universe in the middle frequency obser- 
vation band between the terrestrial band and the low frequency band of other space 
GW detectors. It will use a Fabry-Perot method (instead of a delay line method) 
as in the ground interferometers but with a 1000km arm length. As a LISA follow- 
on, Big Bang Observer (BBO)* with arm length 50,000km was proposed in the 
United States with a similar goal. A likely version of DECIGO/BBO is to have 
12S/Cs with correlated detection. They will be used for the direct measurement 
of the stochastic GW background by correlation analysis.4° 6S/C-ASTROD-GW 
with two sets of ASTROD-GW has also been considered to possibly explore the 
relic GWs in the lower part of the low frequency band.?94° ALIA‘” of arm length 


S/C3 (near L5) 


Fig. 2. Schematic of ASTROD-GW orbit configuration with inclination. Left, projection on the 
ecliptic plane; Right, 3D view with the scale of vertical axis multiplied tenfold.2-4+ (For color 
version, see page I-CP15.)

500,000 km was proposed as a less-ambitious LISA follow-on. TAIJI (also called ALIA descope)48 of arm length 3 million km has also been proposed and is under study with the main goal of detecting intermediate mass black hole binaries at high redshift.

After the end in 2011 of the ESA-NASA partnership for flying LISA, NASA solicited “Concepts for the NASA Gravitational Wave Mission” proposals on 27 September 2011 for the study of low-cost GW missions (http://nspires.nasaprs.com/external/). Geosynchronous LISA/GEOstationary GRAvitational Wave Interferometer (gLISA/GEOGRAWI),49-51 Geostationary Antenna for Disturbance-Free Laser Interferometry (GADFLI),52 and Laser Gravitational-wave Antenna at Geo-lunar Lagrange points (LAGRANGE)53 were proposed, and Orbiting Medium Explorer for Gravitational Astronomy (OMEGA)54,55 re-emerged. OMEGA of arm length 1 million km was first proposed as a low-cost alternative to LISA in the 1990s. An artist’s conception of the OMEGA mission configuration is shown in Fig. 3. In China, a GW mission in Earth orbit called TIANQIN19 of arm length 110,000 km has been proposed and is under study.

Table 1 lists the orbit configuration, arm length, orbit period, S/C number, acceleration noise and laser metrology noise of various GW space mission proposals. Figures 4-6 show respectively the strain Power Spectral Density (PSD) amplitude $[S_h(f)]^{1/2}$ versus frequency plot, the characteristic strain $h_c$ versus frequency plot and the normalized GW spectral energy density $\Omega_{gw}$ versus frequency plot for various GW detectors and sources in the low-frequency band and middle frequency band. The characteristic strain $h_c$, the strain PSD amplitude $[S_h(f)]^{1/2}$ and the normalized GW spectral energy density $\Omega_{gw}$ are related as follows:

$$h_c(f) = f^{1/2}[S_h(f)]^{1/2}; \qquad \Omega_{gw}(f) = \left(\frac{2\pi^2}{3H_0^2}\right)f^3S_h(f) = \left(\frac{2\pi^2}{3H_0^2}\right)f^2h_c^2(f). \tag{1}$$
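
The conversions between the three quantities are simple enough to script. The following is a minimal sketch, assuming the relations $h_c = f^{1/2}[S_h]^{1/2}$ and $\Omega_{gw} = (2\pi^2/3H_0^2)f^3S_h = (2\pi^2/3H_0^2)f^2h_c^2$; the Hubble constant of 70 km/s/Mpc and all function names are assumptions made for illustration only.

```python
import math

MPC_IN_M = 3.0857e22                      # one megaparsec in metres
H0_ASSUMED = 70.0 * 1000.0 / MPC_IN_M     # assumed H0 = 70 km/s/Mpc, in 1/s


def characteristic_strain(frequency, psd_amplitude):
    """h_c(f) from the strain PSD amplitude [S_h(f)]^(1/2) in Hz^(-1/2)."""
    return math.sqrt(frequency) * psd_amplitude


def omega_gw_from_psd(frequency, psd_amplitude, hubble=H0_ASSUMED):
    """Normalized spectral energy density from the strain PSD amplitude."""
    return (2 * math.pi**2 / (3 * hubble**2)) * frequency**3 * psd_amplitude**2


def omega_gw_from_hc(frequency, h_c, hubble=H0_ASSUMED):
    """Normalized spectral energy density from the characteristic strain."""
    return (2 * math.pi**2 / (3 * hubble**2)) * frequency**2 * h_c**2


if __name__ == "__main__":
    f, amp = 1e-3, 1e-20   # hypothetical frequency [Hz] and PSD amplitude [Hz^-1/2]
    h_c = characteristic_strain(f, amp)
    print("h_c             :", h_c)
    print("Omega (from S_h):", omega_gw_from_psd(f, amp))
    print("Omega (from h_c):", omega_gw_from_hc(f, h_c))
```

The two routes to $\Omega_{gw}$ agree by construction, which is a convenient consistency check when moving a sensitivity curve from one type of plot to another.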


[Figure label: 0.6-Gm-high circular, geocentric, near-ecliptic orbit]


Fig. 3. Schematic (left) and artist’s conception (right) of the OMEGA mission configuration.>> 
(For color version, see page I-CP16.) 


Table 1. A compilation of GW mission proposals. Acceleration noise is in fm/s^2/Hz^1/2; laser metrology noise is in pm/Hz^1/2.

Mission concept         Ref.    S/C configuration                    Arm length  Orbit period  S/C #   Accel. noise  Laser metrology noise

Solar-Orbit GW Mission Proposals
LISA                    9       Earthlike solar orbits, 20° lag      5 Gm        year          3       3             20
eLISA                   21      Earthlike solar orbits, 10° lag      1 Gm        year          3       3             12 (10)
ASTROD-GW               36-40   Near Sun-Earth L3, L4, L5 points     260 Gm      year          3       3             1000
Big Bang Observer       45      Earthlike solar orbits               0.05 Gm     year          12      0.03          1.4 x 10^-5
DECIGO                  44      Earthlike solar orbits               0.001 Gm    year          12      0.0004        2 x 10^-8
ALIA                    47      Earthlike solar orbits               0.5 Gm      year          3       0.3           0.6
TAIJI (ALIA-descope)    48      Earthlike solar orbits               3 Gm        year          3       3             5-8
Super-ASTROD            42      Near Sun-Jupiter L3, L4, L5 points   1300 Gm     11 year       4 or 5  3             5000
                                (3 S/C), Jupiterlike solar
                                orbit(s) (1-2 S/C)

Earth-Orbit GW Mission Proposals
OMEGA                   54,55   0.6 Gm height orbit                  1 Gm        53.2 days     6       3             5
gLISA/GEOGRAWI          49-51   Geostationary orbit                  0.073 Gm    24 h          3       3, 30         0.3, 10
GADFLI                  52      Geostationary orbit                  0.073 Gm    24 h          3       0.3, 3, 30    1
TIANQIN                 19      0.057 Gm height orbit                0.11 Gm     44 h          3       1             1
ASTROD-EM               43      Near Earth-Moon L3, L4, L5 points    0.66 Gm     27.3 days     3       1             1
LAGRANGE                53      Earth-Moon L3, L4, L5 points         0.66 Gm     27.3 days     3       3             5

Detailed accounts and explanations of Figs. 4-6 are given in Secs. 3-6 and in Ref. 5. A large part of these figures is taken from the corresponding low frequency band and middle frequency band of Figs. 2-4 in Ref. 5.

In the following section, we discuss the link of gravity (including GW) with orbit 
observations/experiments in the solar system. In Sec. 3, we review the methods 
and the most recent experimental results of radio Doppler spacecraft tracking. In 
Sec. 4, we explain the basic principle of laser-interferometric space mission for GW 
detection. In Sec. 5, we address the sensitivity spectra and review basic noises. In 
Sec. 6, we discuss the scientific goals of GW space missions. In Sec. 7, we address 
the basic orbit design using eLISA and ASTROD-GW as concrete examples. In

Fig. 4. Strain PSD amplitude versus frequency for various GW detectors and GW sources. The
black lines show the inspiral, coalescence and oscillation phases of GW emission from various 
equal-mass black-hole binary mergers in circular orbits at various redshift: solid line, z = 1; 
dashed line, z = 5; long-dashed line z = 20. See text for more explanation. [Cassini Spacecraft 
Doppler Tracking (CSDT); Supermassive Black Hole-GW Background (SMBH-GWB).] (For color 
version, see page I-CP14.)

Fig. 5. Characteristic strain $h_c$ versus frequency for various GW detectors and sources. The
black lines show the inspiral, coalescence and oscillation phases of GW emission from various 
equal-mass black-hole binary mergers in circular orbits at various redshift: solid line, z = 1; 
dashed line, z = 5; long-dashed line z = 20. See text for more explanation. [Cassini Spacecraft 
Doppler Tracking (CSDT); Supermassive Black Hole-GW Background (SMBH-GWB).] (For color 
version, see page I-CP15.) 


Sec. 8, we discuss the orbit design and orbit optimization using ephemerides. In 
Sec. 9, we discuss the deployment of spacecraft to various positions of Earthlike 
solar orbit, their propellant ratios and the total mass requirements. In Sec. 10, we 
discuss time delay interferometry (TDI). In Sec. 11, we discuss the payload. In 
Sec. 12, we summarize the paper and present an outlook. 


2. Gravity and Orbit Observations/Experiments in the Solar 
System 


Historically, the orbit and gravity observations/experiments in the solar system have 
been important resources for the development of fundamental physical laws as the

Fig. 6. Normalized GW spectral energy density $\Omega_{gw}$ versus frequency for various GW detectors and GW
sources. The black lines show the inspiral, coalescence and oscillation phases of GW emission from 
various equal-mass black-hole binary mergers in circular orbits at various redshift: solid line, z = 1; 
dashed line, z = 5; long-dashed line z = 20. See text for more explanation. [Cassini Spacecraft 
Doppler Tracking (CSDT); Supermassive Black Hole-GW Background (SMBH-GWB).] (For color 
version, see page I-CP16.) 


precision and accuracy are improved. It is so for both the development of the Newtonian world system and that of Einstein’s general relativity.°* °* With imminent improvements in orbit and gravity measurements pending, we are in a historical epoch for
a great stride in the testing and development of fundamental laws. The gravita- 
tional field in the solar system is determined by three factors: the dynamic distri- 
bution of matter in the solar system; the dynamic distribution of matter outside 
the solar system (galactic, cosmological, etc.) and GWs propagating through the 
solar system. Different relativistic/cosmological theories of gravity make different 
predictions of the solar system gravitational field. Hence, precise measurements of 
the solar system gravitational field test these relativistic theories, in addition to 
enabling GW observations, determination of the matter distribution in the solar 


1-588 W.-T. Ni 


system and determination of the observable (testable) influence of our galaxy and 
cosmos. To measure the solar system gravitational field, we measure/monitor dis- 
tance between different natural and/or artificial celestial bodies. In the solar system, 
the equation of motion of a celestial body or a spacecraft is given by the astrody- 
namical equation 


$$\mathbf{a} = \mathbf{a}_N + \mathbf{a}_{1PN} + \mathbf{a}_{2PN} + \mathbf{a}_{Gal\text{-}Cosm} + \mathbf{a}_{GW} + \mathbf{a}_{nongrav}, \tag{2}$$

where $\mathbf{a}$ is the acceleration of the celestial body or spacecraft, $\mathbf{a}_N$ is the acceleration due to Newtonian gravity, $\mathbf{a}_{1PN}$ the acceleration due to first post-Newtonian effects, $\mathbf{a}_{2PN}$ the acceleration due to second post-Newtonian effects, $\mathbf{a}_{Gal\text{-}Cosm}$ the acceleration due to Galactic and cosmological gravity, $\mathbf{a}_{GW}$ the acceleration due to GWs, and $\mathbf{a}_{nongrav}$ the acceleration from all nongravitational origins.* Distances
between spacecraft depend critically on the solar system gravity (including gravity 
induced by solar oscillations), underlying gravitational theory and incoming GWs. 
A precise measurement of these distances as a function of time will enable the cause 
of variation to be determined. 

Ideally, it would be desirable to have a constellation of drag-free spacecraft 
navigate through the solar system and range with one another using optical devices 
(or other sensitive devices) to map the solar system gravitational field, to measure 
related solar system parameters, to test relativistic gravity, to observe solar g-mode 
oscillations, and to detect GWs.*° Practically, certain orbit configurations are 
good for testing relativistic gravity; certain configurations are good for measuring 
solar parameters; certain are good for detecting GWs. These factors are integral 
part of mission designs for various purposes.**°? 

To test relativistic gravity, the spacecraft needs to go into inner solar orbit where 
the solar gravity is stronger or to send signals passing near the solar limbs to get 
stronger influence from solar gravity. ASTROD I during the superior solar conjunc- 
tions to measure the Shapiro delay of light and with continuous laser ranging of 
1mm accuracy to improve the determination of relativistic parameters is such a 
mission proposal.*! 3° BepiColombo to be launched in 2017 is an ESA-JAXA mis- 
sion under implementation.©°*! One of its goals of radio science is to test relativistic 
gravity. In determining its orbit about Mercury, it will indirectly find the motion of 
the center of mass of Mercury with an accuracy several orders of magnitude better 
than what is possible by radar ranging to its surface. This is a good opportunity to 
measure Mercury’s perihelion advance and the Shapiro time delay, and to improve 
on the other post-Newtonian parameters by a couple of orders of magnitude.® 

To measure or to improve solar and planetary parameters, the spacecraft needs to go near the measured body or to have supreme sensitivity. NEAR (Near Earth Asteroid Rendezvous mission: determined the mass $(6.687 \pm 0.003) \times 10^{18}$ g and density $2.67 \pm 0.03$ g/cm$^3$ of asteroid 433 Eros, its lower order gravitational harmonics, and its rotation state using ground-based Doppler and range tracking of the NEAR spacecraft orbiting Eros together with images of the asteroid’s surface landmarks), MESSENGER (MErcury Surface, Space ENvironment, GEochemistry, and Ranging: entered orbit around Mercury on March 18, 2011, deorbited as planned, and impacted the surface of Mercury on April 30, 2015; during this period, MESSENGER measured the gravity of Mercury and the state of the planetary core by utilizing the spacecraft’s positioning data) and ASTROD I (a mission proposal having a Venus swing-by for gravity assistance and for improved measurement of Venus gravity/multipole moments, with laser ranging of accuracy about 1 mm for improvement on the parameter determination of planets and asteroids) are such examples.

For laser-interferometric GW detection without fast Doppler tracking (e.g. using 
optical combs), nearly equal arm lengths are required; LISA-like mission concepts 
and ASTROD-GW-like mission concepts are examples. 


3. Doppler Tracking of Spacecraft 


Radio Doppler tracking of spacecraft in a space mission can be used to constrain (or detect) the level of low-frequency GWs. The separated test masses of this GW detector are the Doppler tracking radio antenna on Earth and a distant S/C. Doppler tracking measures relative distance change. From these measurements, GWs can be detected or constrained. In 1967, Braginsky and Gertsenshtein65 first proposed to use Doppler data of spacecraft tracking for GW searches. In 1971, Anderson66 pursued this method of search with preexisting data. Davis67 worked out the GW response of Doppler tracking for special cases in 1974; Estabrook and Wahlquist68 analyzed the effect of GWs passing through the line-of-sight of S/C on the Doppler tracking frequency measurements in general in 1975 (see also Ref. 69).

In Doppler tracking of S/C, a highly stable master clock on Earth is used as a 
reference to control a monochromatic radio wave for transmitting to S/C (uplink). 
When S/C transponder receives the monochromatic radio wave, it phase-locks the 
local oscillator with or without a frequency offset and transponds the local oscillator 
signal back (to Earth station; downlink) coherently. 

The one-way Doppler response $y(t)$ is defined as

$$y(t) = \frac{\Delta\nu}{\nu_0} = \frac{\nu_1(t) - \nu_0}{\nu_0}, \tag{3}$$

where $\nu_0$ is the frequency of the emitted signal and $\nu_1$ is the frequency of the received signal. Far from the GW sources, as is the case in present experimental/observational situations, the plane wave approximation is valid. For weak plane waves propagating in the z-direction in general relativity, we have the following spacetime metric:

$$ds^2 = dt^2 - (\delta_{ij} + h_{ij}(ct - z))\,dx^i dx^j, \qquad |h_{ij}| \ll 1, \tag{4}$$



where Latin indices run from 1 to 3 and sum over repeated indices is assumed. 
Estabrook and Walquist®*®® derived the one-way and two-way Doppler responses 
to plane GWs in weak field approximation (4) in the transverse traceless gauge in 
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general relativity. Written in the notation of Armstrong et al." the formula for 
one-way Doppler response on board S/C 2 received from S/C 1 is 


yt) =(1-k-n)[W(t-(1+k-n)L) — V(e)], (5) 


where k[= (k') = (k', k’, k°)] is the unit vector in the GW propagation direction, 
n|= (n') = (n',n?,n3)] the unit vector along the link from spacecraft 1 to space- 
craft 2 and L is the path length of the Doppler link. The function W(t) is defined as 

n' hy; (t)n4 
{2[1 — (k-n)?]} 
With one-way Doppler response known, two-way and multiple way response can 
easily be written down. As noticed and derived by Tinto and da Silva Alves,”! for 
GW solutions in any metric theories of gravity of the form (4), the Doppler response 
formula (5) with the definition (6) is valid also. 

Doppler tracking of the Viking S/C (S-band, 2.3 GHz),”? the Voyager I S/C (S- 
band uplink + coherently transponded S-band and X-band (8.4GHz) downlink), 
Pioneer 10 (S band), and Pioneer 11 (S band)” have been used for GW measure- 
ment and have given constraints on GW background in the low-frequency band. 

The most recent measurements came from the CSDT. Armstrong et al.”° used 
the Cassini multilink radio system during the 2001-2002 solar opposition to derive 
improved observational limits on an isotropic background of low-frequency GWs. 
The Cassini radio system is a sophisticated multilink system 
that simultaneously receives two uplink signals, at X-band and Ka-band frequencies, 
and transmits three downlink signals: X-band coherent with the X-band uplink, 
Ka-band coherent with the X-band uplink, and Ka-band coherent with the Ka-band 
uplink. X band is a standard deep-space communication frequency band at about 
8.4 GHz; Ka band is another deep-space communication frequency band at about 
32 GHz. Armstrong et al." used the Cassini multilink radio system, with its higher 
frequencies and an advanced tropospheric calibration system, to reduce the effects 
of the leading noises (plasma and tropospheric scintillation) to a level below the other 
noises. The resulting data were used to construct upper limits on the strength of an 
isotropic background in the 1 μHz-1 mHz band.”® The characteristic strain upper 
limit curve labeled CSDT in Fig. 5 is a smoothed version of the curve in Fig. 4 of 
Ref. 76. The corresponding CSDT curves on the strain PSD amplitude in Fig. 4 
and the normalized spectral energy density in Fig. 1 are calculated using Eq. (1) 
for conversion. The minimal points on these curves are 

$[S_h(f)]^{1/2} < 8 \times 10^{-13}$, at several frequencies in the 0.2-0.7 mHz band; 
$h_c(f) < 2 \times 10^{-15}$, at frequency about 0.3 mHz;    (7) 
$\Omega_{gw}(f) < 0.03$, at frequency 1.2 μHz. 


The GW sensitivity of spacecraft Doppler tracking could still be improved by 
1-2 orders of magnitude with a spaceborne optical clock on board.”” 

In the radio tracking of spacecraft, the received frequency of the signals is 
tracked. Its integral is the phase. In the radio ranging of spacecraft, the received 
phase of the signals is measured. The derivative of the phase is the frequency. For 
coherent transponding, the phase measured is basically a range up to an additive 
constant which needs to be determined. 

Pulse laser ranging. Another way to measure the range is by using pulse timing. 
This is what is done in satellite laser ranging and lunar laser ranging. For ranging 
through the Earth’s atmosphere, the best way to find the atmospheric delay is 
to use two colors (two wavelengths) to measure the atmospheric delay and subtract 
it. The distance determination of satellite laser ranging with two colors (two wave- 
lengths) has reached millimeter accuracy. With the newer generation of lunar laser 
ranging,”*”° the accuracy of lunar distance determination has also reached millimeter 
accuracy. Onboard timing accuracy of 3 ps (0.9 mm) has already been achieved 
by the Time Transfer by Laser Link (T2L2) event timer onboard Jason 2 satel- 
lite.8°5! Based on these developments, the one-way ranging technical capability 
over the whole solar system could have a millimeter accuracy. With this accuracy 
and extended ranges of 20 AU, the capability of probing the fundamental laws of 
spacetime and mapping the solar system gravity will be greatly enhanced.*? °° For 
1 mm out of 20 AU, the fractional uncertainty is $3 \times 10^{-16}$. It requires laser stability 
and clock accuracy to reach this level of fractional uncertainty; the accuracy is 
already achieved in the laboratory and will be available in space. ASTROD I?! °° 
using a spaceborne precision clock has included as one of its goals a GW sensitivity 
improvement over the CSDT by one order of magnitude. In fact, the fractional accuracies 
of optical clocks have already reached the $10^{-18}$ level. When space optical clocks 
reach this level, pulse laser ranging together with drag-free technology will be an 
important alternative for detection of GWs in the lower part of the low-frequency band. 
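
The two conversions quoted in this paragraph (a 3 ps timing accuracy corresponding to about 0.9 mm, and 1 mm out of 20 AU corresponding to a fractional uncertainty of about 3 x 10^-16) can be checked with a few lines of arithmetic. The following is a minimal sketch; the function names are ours and purely illustrative, and the value of the astronomical unit is the IAU 2012 defining constant.

```python
C = 299_792_458.0        # speed of light in m/s (exact by definition)
AU = 149_597_870_700.0   # astronomical unit in m (IAU 2012 definition)


def timing_to_range_m(timing_s):
    """One-way range equivalent (in m) of a timing uncertainty (in s)."""
    return C * timing_s


def fractional_uncertainty(range_error_m, distance_m):
    """Range error divided by the total distance ranged."""
    return range_error_m / distance_m


# 3 ps of timing uncertainty is roughly 0.9 mm of one-way range.
range_3ps = timing_to_range_m(3e-12)

# 1 mm out of 20 AU is a fractional uncertainty of roughly 3e-16.
frac_20au = fractional_uncertainty(1e-3, 20 * AU)
```

The second number is the level that laser stability and clock accuracy have to reach for millimeter ranging over solar-system distances.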

The basic principles of spacecraft Doppler tracking, of spacecraft laser ranging, 
of space laser interferometers, and of Pulsar Timing Arrays (PTAs) for GW detection 
are similar. In the development of GW detection methods, the spacecraft Doppler 
tracking method and the pulse laser ranging method have been significant sources of 
inspiration. The methods using space laser interferometers and using PTAs are becoming 
two important methods of detecting GWs. The PTAs and their sensitivity are 
addressed in Refs. 5 and 7. Interferometric space missions and their sensitivities 
will be addressed in the following section. 


4. Interferometric Space Missions 


In a Michelson interferometer, the wave front is split into two parts that travel along two 
different paths, and then the two wave fronts are recombined to interfere. For white 
light, Michelson had to match the two optical path lengths very precisely in order 
to have interference fringes. After the laser was invented, the coherence length became 
longer, and one could build an unequal-arm Michelson interferometer. An alternative 
configuration of the Michelson interferometer is the Mach-Zehnder interferometer. 
Two-way Doppler tracking can be considered as an unequal-arm Michelson interferometer: 
the local oscillator splits off a beam directed to the spacecraft on the uplink, and 
the return beam from the spacecraft transponder interferes with the local oscillator. 
The phase (and frequency) of the beat is measured as a function of time. The 
Doppler response of a single link is given by (5). Using (5), the response of two-way 
Doppler tracking®*® is given by 

$$y(t) = -(1 - \mathbf{k}\cdot\mathbf{n})\,\Psi(t) - 2(\mathbf{k}\cdot\mathbf{n})\,\Psi(t - (1 + \mathbf{k}\cdot\mathbf{n})L) + (1 + \mathbf{k}\cdot\mathbf{n})\,\Psi(t - 2L). \qquad (8)$$

The three terms in (8) correspond, respectively, to the projected amplitude of 
the wave at the event of reception of the Doppler tracking signal at Earth, the 
transponding event at the spacecraft, and the emission event of the tracking signal 
from Earth. 
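
The one-way response (5), the definition (6) and the two-way response (8) can be evaluated numerically as in the following minimal sketch. The function names are ours (hypothetical, not from any library); we assume units with c = 1, so that L is in light-seconds, and we write mu for k·n and h_nn(t) for the projected strain n^i h_ij(t) n^j.

```python
import math


def make_psi(h_nn, mu):
    """Build Psi(t) of Eq. (6) from the projected strain h_nn(t) = n^i h_ij(t) n^j.

    mu stands for k.n; Eq. (6) requires |mu| < 1.
    """
    if not -1.0 < mu < 1.0:
        raise ValueError("mu = k.n must lie strictly between -1 and 1")
    return lambda t: h_nn(t) / (2.0 * (1.0 - mu * mu))


def one_way_response(psi, mu, L, t):
    """Eq. (5): one-way Doppler response, with L in light-seconds (c = 1)."""
    return (1.0 - mu) * (psi(t - (1.0 + mu) * L) - psi(t))


def two_way_response(psi, mu, L, t):
    """Eq. (8): two-way Doppler response, same units as above."""
    return (-(1.0 - mu) * psi(t)
            - 2.0 * mu * psi(t - (1.0 + mu) * L)
            + (1.0 + mu) * psi(t - 2.0 * L))


# Illustrative use: a 1 mHz sinusoidal strain of amplitude 1e-15 (assumed values).
h_example = lambda t: 1e-15 * math.sin(2.0 * math.pi * 1e-3 * t)
psi_example = make_psi(h_example, mu=0.3)
y_two_way = two_way_response(psi_example, mu=0.3, L=1000.0, t=0.0)
```

A useful sanity check is that the three coefficients in (8) sum to zero, so a time-independent Psi produces no Doppler signal.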

Since the deviation of the speed of the electromagnetic wave in plasma from that in 
vacuum is inversely proportional to the square of the frequency, the time 
uncertainty due to solar wind or ionized gas in the microwave propagation is smaller 
in the Ka band (32 GHz) and X band (8.4 GHz) than in the S band (2.3 GHz). This is one 
of two motivations for Doppler tracking of the Cassini spacecraft to use Ka band and X 
band for better noise performance. The other motivation is that with shorter wavelength, 
the measurement precision increases. At optical frequency, the wavelength is more 
than four orders of magnitude smaller and the plasma effect is eight orders of 
magnitude smaller. Therefore, when better sensitivities in the optical path length 
measurement were needed in GW detection, the GW community started to use optical 
methods. When sensitivity is increased, we need to suppress spurious noise below the 
aimed sensitivity level. 
This requires that (i) we reduce the acceleration noise and implement the drag-free 
technology; (ii) we reduce the laser noise as much as possible. The basic drag-free 
technology is now demonstrated by LISA Pathfinder.'! For reducing laser noise, we 
need laser stabilization. The best way is to implement absolute stabilization, e.g. 
to lock to an iodine molecular line. However, laser stabilization alone is not enough 
for the required strain sensitivity of the order of $10^{-21}$. To lessen the laser noise 
requirement, TDI came to the rescue. 
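
The frequency scaling behind the move from S band to X and Ka bands and then to optical links can be illustrated numerically. The sketch below uses only the inverse-square dependence of the plasma effect on frequency stated above; the optical wavelength of 1064 nm is our assumption for illustration (a commonly used laser wavelength), and the function names are hypothetical.

```python
C = 299_792_458.0  # speed of light, m/s

BANDS_HZ = {"S": 2.3e9, "X": 8.4e9, "Ka": 32e9}  # band frequencies quoted in the text
OPTICAL_WAVELENGTH_M = 1064e-9                    # assumed laser wavelength (illustrative)
OPTICAL_HZ = C / OPTICAL_WAVELENGTH_M


def plasma_effect_ratio(f_hz, f_ref_hz):
    """Plasma-induced delay at f_hz relative to f_ref_hz (scales as 1/f^2)."""
    return (f_ref_hz / f_hz) ** 2


def wavelength_ratio(f_hz, f_ref_hz):
    """Wavelength at f_hz relative to the wavelength at f_ref_hz."""
    return f_ref_hz / f_hz


# Ka band versus S band: the plasma effect drops by a factor of roughly 200.
ka_vs_s = plasma_effect_ratio(BANDS_HZ["Ka"], BANDS_HZ["S"])

# Optical versus X and Ka bands: wavelength and plasma-effect ratios.
optical_vs_x = (wavelength_ratio(OPTICAL_HZ, BANDS_HZ["X"]),
                plasma_effect_ratio(OPTICAL_HZ, BANDS_HZ["X"]))
optical_vs_ka = (wavelength_ratio(OPTICAL_HZ, BANDS_HZ["Ka"]),
                 plasma_effect_ratio(OPTICAL_HZ, BANDS_HZ["Ka"]))
```

With these assumed numbers the optical wavelength is about four orders of magnitude shorter than the microwave wavelengths, and the plasma effect is smaller by about the square of that factor, in line with the orders of magnitude quoted in the text.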

For a space laser-interferometric GW antenna, the arm lengths vary according 
to solar system orbit dynamics. In order to attain the requisite sensitivity, laser 
frequency noise must be suppressed below the secondary noises such as the optical 
path noise, acceleration noise, etc. For suppressing laser frequency noise, it is necessary 
to use TDI in the analysis to match the optical path lengths of different beams 
closely. The better the match of the optical path lengths, the better the cancellation 
of the laser frequency noise and the easier it is to achieve the requisite sensitivity. In 
the case of an exact match, the laser frequency noise is fully canceled, as in the original 
Michelson interferometer. 

TDI was first used in the study of the ASTROD mission concept.?%:?%?6 In 
deep-space interferometry, long distances are invariably involved. Due to the long 
distances, laser light is attenuated to a great extent at the receiving spacecraft. 
To transfer the laser light back or to another spacecraft, amplification is needed. 
The procedure is to phase lock the local laser to the incoming weak laser light 
and to transmit the local laser light back or to another spacecraft. Liao et al.?9°° 
have demonstrated the phase locking of a local oscillator with 2-pW laser light in 
the laboratory. Dick et al.8? have demonstrated phase locking to 40-fW incoming weak 
laser light. The power requirement feasibility for both e-LISA/NGO and ASTROD- 
GW is met with these developments. In the 1990s, Ni et al.?*:?>-76 used the following 
two TDI configurations during the study of ASTROD interferometry and obtained 
numerically the path length differences using Newtonian dynamics. 

These two TDI configurations are the unequal-arm Michelson TDI configuration 
and the Sagnac TDI configuration for three-spacecraft formation flight. The principle 
is to send two split laser beams along Paths 1 and 2 and have them interfere at the end 
of their paths. For the unequal-arm Michelson TDI configuration, one laser beam starts 
from spacecraft 1 (S/C1) and is directed to and received by spacecraft 2 (S/C2), where 
it optically phase locks the local laser; the phase-locked laser beam is then directed 
to and received by S/C1, where it optically phase locks another local laser; and 
so on, following Path 1 to return to S/C1: 

Path 1: S/C1 → S/C2 → S/C1 → S/C3 → S/C1.    (9) 

The second laser beam also starts from S/C1, but follows Path 2: 

Path 2: S/C1 → S/C3 → S/C1 → S/C2 → S/C1,    (10) 

to return to S/C1 and to interfere coherently with the first beam. If the two paths 
have exactly the same optical path length, the laser frequency noises cancel out; if 
the optical path length difference of the two paths is small, the laser frequency 
noises cancel to a large extent. In the Sagnac TDI configuration, the two paths are: 

Path 1: S/C1 → S/C2 → S/C3 → S/C1, 
Path 2: S/C1 → S/C3 → S/C2 → S/C1.    (11) 
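
To illustrate how the path length difference of two such TDI paths can be obtained numerically, the following is a minimal sketch under strongly simplifying assumptions: the three spacecraft sit on a rigidly rotating equilateral triangle (a toy model, not the ephemeris-based orbits used in the actual studies), light propagates in flat space, and the light travel time of each link is found by fixed-point iteration. All function names are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def make_rotating_triangle(radius_m, omega_rad_s):
    """Toy orbit model: S/C 1, 2, 3 on a circle, 120 degrees apart, rotating rigidly."""
    def position(sc, t):
        angle = 2.0 * math.pi * (sc - 1) / 3.0 + omega_rad_s * t
        return (radius_m * math.cos(angle), radius_m * math.sin(angle), 0.0)
    return position


def light_time(position, sender, receiver, t_emit, iterations=30):
    """Solve dt = |r_receiver(t_emit + dt) - r_sender(t_emit)| / c by iteration."""
    r_emit = position(sender, t_emit)
    dt = 0.0
    for _ in range(iterations):
        dt = math.dist(r_emit, position(receiver, t_emit + dt)) / C
    return dt


def path_time(position, spacecraft_sequence, t_start=0.0):
    """Total light travel time along a path such as [1, 2, 3, 1]."""
    t = t_start
    for sender, receiver in zip(spacecraft_sequence, spacecraft_sequence[1:]):
        t += light_time(position, sender, receiver, t)
    return t - t_start


# Sagnac configuration (11) on a triangle rotating once per year (illustrative values).
orbit = make_rotating_triangle(radius_m=1.0e9, omega_rad_s=2.0e-7)
sagnac_difference_s = (path_time(orbit, [1, 2, 3, 1])
                       - path_time(orbit, [1, 3, 2, 1]))
```

In this toy model the two Sagnac paths differ by the classical Sagnac time 4AΩ/c² (A being the triangle area), while the two unequal-arm Michelson paths (9) and (10) have equal length by symmetry. With realistic orbits the arm lengths vary in time and both differences have to be computed numerically.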


Since then, we have worked out the same quantities numerically for LISA,®? 
eLISA/NGO,** LISA-type with $2 \times 10^6$ km arm length,°+ ASTROD-GW with no 
inclination,®*°6 and ASTROD-GW with inclination.“ 

TDI has been worked out for LISA much more thoroughly on various aspects 
since 1999.8788 First-generation and second-generation TDIs have been proposed. In 
the first-generation TDIs, static situations are considered, while in the second-generation 
TDIs, motions are compensated to a certain degree. The two configurations 
considered above are first-generation TDI configurations in the sense of 
Armstrong et al.8%88 We will discuss numerical TDI more in Sec. 10. For many 
other aspects of TDI, we refer the readers to the excellent review.®® 

In Table 1, we have compiled various interferometric space mission proposals 
for GW detection. Among the proposed science orbits, there are basically three 
categories: ASTROD-GW-like, LISA-like and OMEGA-like. 

(i) LISA-like (LAGO-like)!* science orbits: As in Fig. 1, the Earth-like solar 
orbits of the three spacecraft are appropriately inclined so that they form a nearly 
equilateral triangle formation having a tilt of ±60° (in the figure, the tilt is 60°) with 
respect to the ecliptic plane.'+ The formation rotates once per year clockwise or 
counterclockwise facing the Sun. Sections 7 and 8 give more detailed orbit analysis. 
LISA,® eLISA,?4 ALIA*” and TAIJI (ALIA-descope)*® have this kind of LISA-like 
science orbits. The ultimate configuration of Big Bang Observer*® and DECIGO*4 
has 12 spacecraft distributed along the Earth's orbit in three groups separated by 120° 
in orbit; two groups have three spacecraft each in a LISA-like triangular formation 
and the third group has six spacecraft with two LISA-like triangles forming a star 
configuration (Fig. 6). An alternate configuration is that each group has four spacecraft 
forming a nearly square configuration (also with a tilt of 60° with respect to 
the ecliptic plane). 

(ii) OMEGA-like science orbits: These are Earth orbits away from (either 
inside or outside) the Moon’s orbit around the Earth. An example is the OMEGA 
mission orbit configuration. The OMEGA mission was proposed to NASA as a candidate 
MIDEX mission in 1998, and again as a mission-concept white paper in 2011. 
The OMEGA®*°5 mission consists of six identical spacecraft in a 600,000 km-high 
Earth orbit, two spacecraft at each vertex of a nearly equilateral triangle formation 
(Fig. 3). These orbits are stable, allowing for three years of planned science 
operations, as well as the possibility of an extended mission if desired. The arm 
length of the triangle formation is about 1 million km (1 Gm). The mission formation 
is outside the Moon’s orbit. 


Fig. 6. Two schematic configurations of BBO and DECIGO in Earth-like solar orbits. 

There are two mission proposals, GEOGRAWI*®/gLISA®*! and GADFLI,*? 
using geostationary orbit formation. The three spacecraft of the formation are in 
geostationary orbits forming a nearly equilateral triangle with arm length about 
73,000 km. 

TianQin is a GW mission proposal with a 57,000 km-high orbit. The three spacecraft 
form a nearly equilateral triangle with arm length about 110,000 km and an 
Earth-orbiting period of 44 h.'° 
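
The arm lengths and the orbital period quoted for these Earth-orbiting formations follow from simple geometry and Kepler's third law. The sketch below is an illustrative check only: it assumes circular orbits, interprets "57,000 km-high" as altitude above the Earth's equatorial radius, and takes an equilateral triangle inscribed in the orbit, so that the arm length is sqrt(3) times the orbit radius. The function names are ours.

```python
import math

GM_EARTH = 3.986004418e14   # Earth's gravitational parameter, m^3 s^-2
R_EARTH = 6.378137e6        # Earth's equatorial radius, m
SIDEREAL_DAY_S = 86164.0905 # rotation period of the Earth, s


def triangle_arm_m(orbit_radius_m):
    """Side of an equilateral triangle inscribed in a circle of the given radius."""
    return math.sqrt(3.0) * orbit_radius_m


def orbital_period_s(orbit_radius_m):
    """Kepler's third law for a circular orbit around the Earth."""
    return 2.0 * math.pi * math.sqrt(orbit_radius_m ** 3 / GM_EARTH)


def radius_for_period_m(period_s):
    """Circular orbit radius with the given period (inverse of the function above)."""
    return (GM_EARTH * period_s ** 2 / (4.0 * math.pi ** 2)) ** (1.0 / 3.0)


# Geostationary formation: arm length of roughly 73,000 km.
geo_arm_km = triangle_arm_m(radius_for_period_m(SIDEREAL_DAY_S)) / 1e3

# 57,000 km-high orbit: arm length of roughly 110,000 km, period of roughly 44 h.
tianqin_radius_m = R_EARTH + 57_000e3
tianqin_arm_km = triangle_arm_m(tianqin_radius_m) / 1e3
tianqin_period_h = orbital_period_s(tianqin_radius_m) / 3600.0
```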

The orbits and spacecraft configurations of all these missions are near the ecliptic 
plane. At times sunlight comes along the line of sight of the telescope links. 
Sunlight shields are required when the line of sight crosses the Sun. A solution has 
been proposed in the OMEGA mission proposal®* which could be used for other 
missions in this category. 

(iii) ASTROD-GW-like science orbits: The basic ASTROD-GW configuration 
consists of three spacecraft in the vicinity of the Sun-Earth Lagrange points L3, 
L4 and L5, respectively, with near-circular orbits around the Sun, forming a nearly 
equilateral triangle as shown by Fig. 2 with the three arm lengths about $2.6 \times 10^8$ km 
(1.732 AU).336 4° The dominant force on the spacecraft is from the Sun in the 
restricted three-body problem of the Earth-Sun-spacecraft system. Since the Earth-Sun 
orbit is elliptical, the Lagrange points are not stationary in the Earth-Sun 
rotating frame. The motion of test particles at L3, L4 and L5 deviates from circular 
orbit by a fraction of $O(e)$, where $e$ (= 0.0167) is the eccentricity of the Earth's orbit 
around the Sun. However, the spacecraft can be in halo orbits of the respective 
Lagrange points, largely compensating the nonstationary motion of the Lagrange 
points so as to remain in nearly circular orbits around the Sun. The circular orbits of 
spacecraft near the L3, L4 and L5 points are stable or virtually stable over 20 years (their orbits 
are also stable or quasi-stable with respect to their respective Lagrange points so 
that the deviations from circular orbit of their respective Lagrange point are of 
the order of $O(e^2)$ in AU) and the deviation of the spacecraft triangle from an 
equilateral triangle is of the order of $O(e^2)$ in arm length. For a nonprecessing planar 
formation, the angular resolution has antipodal ambiguity. To resolve this issue, we 
need a precessing orbit formation inclined with respect to the ecliptic. When 
the orbits of the spacecraft have a small inclination $\lambda$ (in radians) with respect to the 
ecliptic plane, the arm length variation is of the order of $O(\lambda^2)$. Therefore, the added 
variation due to these two causes is of the order $O(e^2, \lambda^2)$. For these two causes 
to match (to $O(10^{-4})$), $\lambda$ should be of the order of $O(1°)$. In Sec. 7, we review the 
inclined orbit analytically in the solar gravitational field and explain the angular 
resolution together with how to resolve the antipodal ambiguity. In Sec. 8, we will 
use solar system ephemeris to design and optimize the orbit configuration and will 
see that the perturbation from all planets except Earth is of the order of $O(10^{-4})$. 
The influence of Earth is already taken into consideration since the L3, L4 and 
L5 points are effectively stable over 20 years. Hence, suitable inclined circular orbits 
could be our basic orbits to start with, and the deviations from the actual optimized 
orbit should be of the order of $O(10^{-4})$. 
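
The orders of magnitude in this paragraph can be made concrete with a short numerical check. The sketch below only evaluates the quantities named in the text (e², λ² for λ = 1°, and the arm length of an equilateral triangle inscribed in a 1 AU circle); the variable and function names are ours.

```python
import math

AU_KM = 149_597_870.7   # astronomical unit in km
E_EARTH = 0.0167        # eccentricity of the Earth's orbit, as quoted in the text


def inclination_matching_eccentricity_deg(e):
    """Inclination (in degrees) for which lambda^2 equals e^2, i.e. lambda = e radians."""
    return math.degrees(e)


e_squared = E_EARTH ** 2                      # roughly 2.8e-4
lambda_squared = math.radians(1.0) ** 2       # roughly 3.0e-4 for a 1 degree inclination
matching_deg = inclination_matching_eccentricity_deg(E_EARTH)

# Arm length of a triangle with vertices near L3, L4 and L5 on a 1 AU circle.
arm_au = math.sqrt(3.0)                       # about 1.732 AU
arm_km = arm_au * AU_KM                       # about 2.6e8 km
```

Both e² and λ² for λ of about 1° come out at a few times 10^-4, which is why an inclination of order 1° makes the two contributions to the arm length variation comparable.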


1-596 W.-T. Ni 


For Super-ASTROD,*” we could also place the three spacecraft with small inclination 
angle to the Jovian solar orbit plane near the Sun-Jupiter L3, L4 and L5 points, with 
one or two other spacecraft having large inclination(s). 

For ASTROD-EM,** the three spacecraft will be placed near the Earth-Moon L3, 
L4 and L5 points. For the spacecraft dynamics, we have a restricted four-body (Earth, 
Moon, Sun, and the spacecraft, whose gravitational field can be neglected) problem 
to work out. 


5. Frequency Sensitivity Spectrum 


The space GW detectors are basically real-time free-mass detectors. As we have 
already discussed in Ref. 5 in general, there are two crucial issues in these proposed 
detectors: (i) to lower the disturbance effects and/or to model them for subtraction: 
drag-free to decrease the effects of surrounding disturbances, and appropriate mod- 
eling of the motion and the disturbances for subtraction to lower the residuals; (ii) 
to increase measurement sensitivity: microwave sensing, optical sensing, X-ray sens- 
ing, atom sensing, molecule sensing and timing. Associated with these two issues, 
there are two basic noises — the acceleration noise and the metrology noise. For 
laser-optic missions, the metrology noise is the laser metrology noise. The planned 
upper limits of these two kinds of basic noise for GW mission proposals are listed 
in the last two columns of Table 1. In space GW detection, the basic noise model 
is the LISA/eLISA noise model. Due to more stringent technological requirements, 
Big Bang Observer and DECIGO belong to second-generation space detector pro- 
posals. Super-ASTROD is also a second-generation space detector proposal due to 
its distance and power requirements. All others in Table 1 are first-generation space 
detector proposals. In Figs. 4-1, we plot the sensitivity curves of three typical first- 
generation space detectors (LISA/eLISA, ASTROD-GW and OMEGA) and two 
second-generation space detectors (Big Bang Observer and DECIGO). In the first 
generation category, for missions with arm length shorter than LISA, the planned 
strain upper limits are smaller than that of LISA in the higher frequency part; for 
missions with arm length longer than LISA, the planned strain upper limits are 
smaller than that of LISA in the lower frequency part. 

As shown in Fig. 4, typical frequency sensitivity spectrum of strain PSD ampli- 
tude for space GW detection consists of three regions, the acceleration/vibration 
noise dominated region, the shot noise (flat for current space detector projects like 
LISA in strain PSD) dominated region, if any, and the antenna response restricted 
region. The lower frequency region for the detector sensitivity is dominated by 
vibration, acceleration noise or gravity-gradient noise. The higher frequency part 
of the detector sensitivity is restricted by antenna response (or storage time). In a 
power-limited design, sometimes there is a middle flat region in which the sensitivity 
is limited by the photon shot noise.%:?%:40 

The shot noise sensitivity limit in the strain for GW detection is inversely proportional 
to $P^{1/2}L$, with $P$ the received power and $L$ the distance or arm length. 
Since $P$ is inversely proportional to $L^2$, $P^{1/2}L$ is constant and this sensitivity limit 
is independent of the distance. For 1-2 W emitting power, the limit is around 
$10^{-21}$ Hz$^{-1/2}$. As noted in the LISA study,? making the arms longer shifts the 
time-integrated sensitivity curve to lower frequencies while leaving the bottom of 
the curve at the same level. Hence, ASTROD-GW with longer arm length has 
better sensitivity at lower frequency. e-LISA, ALIA, TAIJI (ALIA-descope), and 
GW interferometers in Earth orbit have shorter arms and therefore have better 
sensitivities at higher frequency. 
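
The scaling argument above can be verified in a few lines. The sketch below is illustrative only: it assumes a fixed emitted power and telescope, so that the received power falls as 1/L², and it works up to an overall constant. The reference arm length and the received-to-emitted power fraction are hypothetical placeholders, not mission values.

```python
import math

REF_ARM_M = 5e9        # reference arm length (illustrative)
REF_FRACTION = 1e-10   # hypothetical received/emitted power ratio at REF_ARM_M


def received_power_w(emitted_w, arm_m):
    """Received power, assumed to fall off as 1/L^2."""
    return emitted_w * REF_FRACTION * (REF_ARM_M / arm_m) ** 2


def position_noise_scale(emitted_w, arm_m):
    """Shot-noise position noise amplitude up to a constant: proportional to P^(-1/2)."""
    return 1.0 / math.sqrt(received_power_w(emitted_w, arm_m))


def strain_limit_scale(emitted_w, arm_m):
    """Shot-noise strain limit up to a constant: proportional to 1 / (P^(1/2) L)."""
    return position_noise_scale(emitted_w, arm_m) / arm_m


# Same emitted power, three illustrative arm lengths: 1 Gm, 5 Gm and 260 Gm.
strain_limits = [strain_limit_scale(1.0, arm) for arm in (1e9, 5e9, 260e9)]
```

The three strain limits coincide, while the position noise amplitude grows in proportion to the arm length; in power this is the square of the arm length ratio.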

In Figs. 4-1, we plot sensitivity curves for LISA, e-LISA and ASTROD-GW 
for the low-frequency GW band. In the Mock LISA Data Challenge (MLDC) 
program, the consensus goal for the LISA instrumental noise density amplitude 
is given by (12a), where $L_L = 5 \times 10^9$ m is the LISA arm length, $f_L = c/(2\pi L_L)$ is the LISA arm 
transfer frequency, $S_{Lp} = 4 \times 10^{-22}$ m$^2$ Hz$^{-1}$ is the LISA (white) position noise 
(power) level due to photon shot noise, and $S_a = 9 \times 10^{-30}$ m$^2$ s$^{-4}$ Hz$^{-1}$ is the LISA 
white acceleration noise (power) level.®? Note that (12a) contains the “reddening” 
factor $[1 + (10^{-4}\,\mathrm{Hz}/f)^2]$ in the acceleration noise term. 

In 2003, Bender” looked into the possible LISA sensitivity below 100 μHz. From 
a careful analysis of noises of test mass and capacitive sensing, Bender suggested a 
specific sensitivity goal at frequencies down to 3 μHz which contained a milder (than 
MLDC) “reddening factor”. For frequency between 10 μHz and 100 μHz, he suggested 
putting in the “reddening factor” $[(10^{-4}\,\mathrm{Hz}/f)^{1/2}]$ and for frequency between 
3 μHz and 10 μHz, the “reddening factor” $[3.16 \times (10^{-5}\,\mathrm{Hz}/f)]$. To drop this “reddening 
factor” might be difficult. However, with monitoring the gap of capacitive 
sensing and the positions of major mass distribution, the factor may be alleviated 
to a certain extent. To completely drop the factor or to go beyond, one may 
need to go to optical sensing and optical feedback control.24?"78-9! 93 If we drop 
the “reddening factor”, the enhanced LISA instrumental noise density amplitude 
is given by (12b). 

After NASA’s withdrawal from the ESA-NASA collaboration on LISA in 2011, the 
European eLISA/NGO for space detection of GWs emerged. The orbit configuration 
is the same as LISA, but with arm length reduced fivefold to 1 million km, the 
orbits slowly drifting away from the Earth and the nominal mission duration two 
years (extendable to five years) to save weight, fuel and costs. The three spacecraft 
will consist of one “mother” and two simpler “daughters,” with interferometric measurements 
along only two arms with the “mother” at the vertex.?4 The eLISA/NGO 
strain noise PSD goal is also shown in Fig. 4. For the lower frequency part of the 
power spectrum of eLISA/NGO, we choose to use the same acceleration noise with 
reddening factor (solid line) and without reddening factor (dashed line) as those of 
LISA to obtain the eLISA/NGO strain noise for easy comparison. 

The eLISA arm length $L_{eL}$ is five times shorter. Its instrumental noise density 
amplitude is given by (13a), where $L_{eL} = 10^9$ m is the eLISA arm length, $f_{eL} = c/(2\pi L_{eL})$ is the eLISA arm 
transfer frequency, $S_{eLp} = 1 \times 10^{-22}$ m$^2$ Hz$^{-1}$ is the eLISA (white) position noise 
level due to photon shot noise assuming that the telescope diameter is 25 cm 
(compared with 40 cm for that of LISA) and that the laser power is the same 
as LISA. With these assumptions, the eLISA position noise amplitude would be 
10 pm/Hz$^{1/2}$ listed in parentheses in the eLISA entry, comparable to 12 pm/Hz$^{1/2}$ 
used in Ref. 94. The corresponding enhanced eLISA instrumental noise density 
amplitude is given by (13b). 

For ASTROD-GW, our goal on the instrumental strain noise density amplitude 
is given by (14) over the frequency range 100 nHz < f < 1 Hz. Here $L_A = 260 \times 10^9$ m is the 
ASTROD-GW arm length, $f_A = c/(2\pi L_A)$ is the ASTROD-GW arm transfer frequency, 
$S_a = 9 \times 10^{-30}$ m$^2$ s$^{-4}$ Hz$^{-1}$ is the white acceleration noise level (the same 
as that for LISA), and $S_{Ap} = 10{,}816 \times 10^{-22}$ m$^2$ Hz$^{-1}$ is the (white) position noise 
level due to laser shot noise, which is 2704 (= $52^2$) times that for LISA.*36 4° The 
corresponding noise curve for the ASTROD-GW instrumental noise density amplitude 
with the same “reddening” factor as specified in the MLDC program is given by (14a) 
over the frequency range 100 nHz < f < 1 Hz. The sensitivity curves from the six 
formulas (12a), (12b) to (14a) are shown in Fig. 4. The corresponding sensitivity 
curves in terms of $h_c(f)$ and $\Omega_{gw}(f)$ are shown in Figs. 5 and 1, respectively. The 
ones with reddening factor are shown with dashed line in the lower frequency part. 
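
To make the role of the parameters above concrete, the following sketch evaluates a strain noise amplitude for LISA, eLISA and ASTROD-GW. The parameter values, the MLDC reddening factor and the two Bender factors are taken from the text; the functional form combining them (a position noise term multiplied by [1 + 0.5 (f/f_arm)²], plus an acceleration noise term 4 S_a/(2πf)⁴, with the square root divided by the arm length) is an assumption on our part, modeled on the form commonly quoted for the MLDC noise goal, and should not be read as a transcription of (12a)-(14a). How the Bender factor enters the noise budget is not specified here, so it is provided only as a standalone function.

```python
import math

C = 299_792_458.0   # speed of light, m/s
S_A = 9e-30         # white acceleration noise power, m^2 s^-4 Hz^-1 (from the text)

# (arm length in m, white position noise power in m^2 Hz^-1), values from the text
MISSIONS = {
    "LISA": (5e9, 4e-22),
    "eLISA": (1e9, 1e-22),
    "ASTROD-GW": (260e9, 10816e-22),
}


def mldc_reddening(f_hz):
    """MLDC reddening factor [1 + (1e-4 Hz / f)^2] quoted in the text."""
    return 1.0 + (1e-4 / f_hz) ** 2


def bender_reddening(f_hz):
    """Milder piecewise factor quoted in the text for 3 uHz <= f <= 100 uHz."""
    if 10e-6 <= f_hz <= 100e-6:
        return math.sqrt(1e-4 / f_hz)
    if 3e-6 <= f_hz < 10e-6:
        return 3.16 * (1e-5 / f_hz)
    raise ValueError("factor is only quoted between 3 uHz and 100 uHz")


def strain_noise_amplitude(f_hz, mission, reddening=None):
    """ASSUMED functional form (see lead-in); returns an amplitude in Hz^(-1/2)."""
    arm_m, s_p = MISSIONS[mission]
    f_arm = C / (2.0 * math.pi * arm_m)              # arm transfer frequency
    position_term = (1.0 + 0.5 * (f_hz / f_arm) ** 2) * s_p
    accel_term = 4.0 * S_A / (2.0 * math.pi * f_hz) ** 4
    if reddening is not None:
        accel_term *= reddening(f_hz)
    return math.sqrt(position_term + accel_term) / arm_m


# Acceleration-noise-dominated regime: amplitudes scale inversely with arm length.
ratio_lisa = (strain_noise_amplitude(1e-5, "LISA", mldc_reddening)
              / strain_noise_amplitude(1e-5, "ASTROD-GW", mldc_reddening))
ratio_elisa = (strain_noise_amplitude(1e-5, "eLISA", mldc_reddening)
               / strain_noise_amplitude(1e-5, "ASTROD-GW", mldc_reddening))
```

Under this assumed form, the low-frequency ratios come out close to 52 (LISA) and 260 (eLISA), reflecting only the arm length ratios, and the two Bender factors join continuously at 10 μHz.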

With the same laser power as that of LISA, the ASTROD-GW sensitivity would 
be shifted to lower frequency by a factor up to 52 if other frequency-dependent 
requirements can be shifted and met. The sensitivity curve would then be shifted 
toward lower frequency as a whole. Since the main constraint on the lower-frequency 
part of the sensitivity is from the accelerometer noise, this translational 
shift depends on whether the accelerometer noise requirement for ASTROD-GW 
could be lowered (made more stringent) from the LISA requirement at a particular 
frequency. Since ASTROD is in a time frame later than LISA, if the absolute metrological 
accelerometer/inertial sensor could be developed, there is a potential to go 
toward this requirement. However, to be simple, we have taken a conservative stand 
and assume that the LISA accelerometer noise goal and all other local requirements 
are taken as they are in the above equations and in the plotting of sensitivity curves 
in Figs. 4-1. Since the strain sensitivity is mainly the accelerometer noise divided by 
arm length at low frequency, at a particular low frequency limited by accelerometer 
noise, the strain sensitivity for ASTROD-GW is 52 times lower than LISA (or 260 
times lower than eLISA) due to longer arm length whether we take (12a) (or (13a)) 
and (14a) to compare or (12b) (or (13b)) and (14) to compare. With better lower- 
frequency resolution, the confusion limit of Galactic compact binary background for 
ASTROD-GW would be somewhat lower than that for LISA. The confusion limit 
for eLISA would be somewhat higher than that for LISA. In Figs. 4-1, the confu- 
sion limit curves are for LISA. ASTROD-GW will complement LISA and PTAs in 
exploring single events and backgrounds of MBH—MBH binary GWs in the impor- 
tant frequency range 100 nHz-1 mHz to study black hole co-evolution with galaxies, 
dark energy and other issues (Sec. 6). 

OMEGA has 1 million km arms just as eLISA. The sensitivity goal of OMEGA 
is: (i) the acceleration noise PSD is the same as LISA and eLISA; (ii) the (white) 
position noise amplitude is fourfold lower than LISA and twofold lower than eLISA. 
The sensitivity curve of OMEGA plotted in Fig. 4 is from Ref. 55, with corresponding 
curves shown in Figs. 5 and 1. The lower frequency part and the flat part 
are close to eLISA while the antenna-response-limited part is slightly better. The 
small difference as compared to that in Table 1 may be because OMEGA has 
three pairs of S/C with one more link for interferometry, or just because of different 
sources for the drawings. 

For GW mission proposals listed in Table 1 with formations inside the lunar 
orbit around the Earth, the acceleration noise requirements are at about the same 
level as or slightly more stringent than those of OMEGA and eLISA, while the requirement 
on the position noise amplitude is lower because more power is received. The goal 
sensitivity curves in the higher frequency part are slightly better for the two mission 
proposals in geostationary orbits, gLISA/GEOGRAWI and GADFLI. TIANQIN, 
with 0.11 Gm arm length, aims at first and sure detection of a GW source in space; 
the required sensitivities $S_a^{1/2}$ and $S_p^{1/2}$ are $S_a^{1/2} = 1 \times 10^{-15}$ m s$^{-2}$ Hz$^{-1/2}$ and 
$S_p^{1/2} = 1$ pm Hz$^{-1/2}$ at 6 mHz. 

ALIA in solar orbit as a LISA follow-on aims at better sensitivity at frequency 
above 1 mHz. It has arm length of 0.5 Gm (0.5 million km), ten times shorter 
than LISA and two times shorter than eLISA. The acceleration noise requirement 
is tenfold more stringent than LISA, i.e. $S_a^{1/2} = 0.3 \times 10^{-15}$ m s$^{-2}$ Hz$^{-1/2}$. The 
position noise amplitude requirement is 30 times more stringent than LISA, i.e. 
$S_p^{1/2} = 0.6$ pm Hz$^{-1/2}$. TAIJI (ALIA-descope) has arm length of 3 Gm 
and aims at a detection of intermediate-mass black hole coalescence in addition to other 
scientific goals common to most space mission proposals. Its sensitivity is relaxed 
from ALIA to $S_a^{1/2} = 3 \times 10^{-15}$ m s$^{-2}$ Hz$^{-1/2}$ (the same as LISA) and 
$S_p^{1/2} = 5$-$8$ pm Hz$^{-1/2}$. 

The three spacecraft of ASTROD-EM and of LAGRANGE will be located near the 
L3, L4 and L5 Lagrange points of the Earth-Moon system, respectively. Due to the 
inclination of the Moon-Earth orbit plane to the ecliptic, the spacecraft formation 
plane will not intersect the Sun. Hence, unlike other missions in Earth orbit, 
sunlight will not come along the line of sight of the telescope links, and sunlight shields are 
not required. The spacecraft orbit dynamics is a restricted four-body (Earth, Moon, 
Sun and the spacecraft) problem which we are still working on.*? The acceleration 
noise and the laser metrology noise requirements are listed in Table 1. 

BBO and DECIGO have similar goals of detecting primordial GWs. BBO 
has a delay line implementation. DECIGO uses a Fabry-Perot implementation. 
The acceleration noise $S_a^{1/2}$ and the laser metrology noise $S_p^{1/2}$ requirements 
of BBO are $S_a^{1/2} = 3 \times 10^{-17}$ m s$^{-2}$ Hz$^{-1/2}$ and $S_p^{1/2} = 1.4 \times 10^{-5}$ pm Hz$^{-1/2}$, 
respectively; those of DECIGO are $S_a^{1/2} = 4 \times 10^{-19}$ m s$^{-2}$ Hz$^{-1/2}$ and 
$S_p^{1/2} = 2 \times 10^{-5}$ pm Hz$^{-1/2}$. The strain sensitivity curve of a single DECIGO interferometer 
as shown in Fig. 5 is from Ref. 95. BBO has a similar single-interferometer sensitivity 
curve. The one-sigma, power-law integrated sensitivity curve for BBO (BBO-corr) as 
shown in Fig. 5 is obtained by Thrane and Romano.®® That of DECIGO is similar. 
We also put in the plot their LISA autocorrelation measurement sensitivity curve 
(LISA-corr) in a single detector assuming perfect subtraction of instrumental noise 
and/or any unwanted astrophysical foreground.9® The minimum autocorrelation 
sensitivity using the same method for ASTROD-GW is also estimated and plotted 
in Fig. 5; this would also be the level that 6S/C ASTROD-GW“ (6S/C ASTROD-GW-corr) 
could reach. All of the corresponding curves are plotted in Figs. 4 and 1. 
Considering the sensitivity requirements or arm length involved, DECIGO, BBO 
and Super-ASTROD belong to the second-generation space interferometers. For 
the sensitivity of Super-ASTROD, we assume $S_a^{1/2} = 3 \times 10^{-15}$ m s$^{-2}$ Hz$^{-1/2}$ (the 
same as LISA) and $S_p^{1/2} = 5000$ pm Hz$^{-1/2}$. 

Atom Interferometry. The development in atom interferometry is fast and 
promising. It already contributes to precision measurement and fundamental 
physics. A proposal using atom interferometry to detect GWs has been raised at 
Stanford University as an alternate method to LISA on the LISA bandwidth.9”% 
Issues have arisen on its realization of LISA sensitivity for this proposal.?9!0° At 
Observatoire de Paris, SYRTE has started the first stage of its project, the Matter-wave 
laser Interferometric Gravitation Antenna (MIGA),!! which is building a 300 m long 
optical cavity to interrogate atom interferometers at the underground laboratory 
Laboratoire Souterrain à Bas Bruit (LSBB) in Rustrel. In the second stage of the 
project (2018-2023), MIGA will be dedicated to science runs and data analyses in 
order to probe the spatio-temporal structure of the local field of the LSBB region. 
In the meantime, MIGA will assess future potential applications of atom interfer- 
ometry to GW detection in the middle frequency band (0.1—10 Hz). 


6. Scientific Goals 


In this section, we review and summarize the scientific goals for space GW mission 
proposals and projects.*:9:7139:49.94 More studies on the scientific goals and data 
analysis in the next few years will be valuable for the preparation of space GW 
missions. 


6.1. Massive black holes and their co-evolution with galaxies 


Relations have been discovered between the MBH mass and the bulge mass of the 
host galaxy, and between the MBH mass and the velocity dispersion of the host galaxy. 
These relations indicate that the central MBHs are linked to the evolution of galactic 
structure. Observational evidence indicates that MBHs reside in most local galaxies. 
Newly fueled quasars may come from the gas-rich major merger of two massive 
galaxies. GW observation in the low frequency band (100 nHz-100 mHz) by space 
interferometers and in the very low frequency band (300 pHz-100 nHz) by PTAs will be a 
major tool to study the co-evolution of galaxies with BHs. 

The standard theory of MBH formation is the merger-tree theory with vari- 
ous Massive Black Hole Binary (MBHB) inspirals acting. The GWs from these 
MBHB inspirals can be detected and explored to cosmological distances using space 
GW detectors and PTAs depending on the masses of MBHBs. Although there are 
different merger-tree models and models with BH seeds, they all give significant 
detection rates for space GW detectors and PTAs,[7,102–104] NGO/eLISA[21] and
ASTROD-GW.[40] PTAs are most sensitive in the frequency range 300 pHz–100 nHz, the
NGO/eLISA space GW detector is most sensitive in the frequency range 2 mHz–0.1 Hz,
while ASTROD-GW is most sensitive in the frequency range 100 nHz–2 mHz (Figs. 4-1).
NGO/eLISA and ASTROD-GW will be able to directly observe how MBHs form, grow, and
interact over the entire history of galaxy formation. ASTROD-GW will detect the
stochastic GW background from MBH binary mergers in the frequency range 100 nHz to
100 µHz. These observations are significant and
important to the study of co-evolution of galaxies with MBHs. The expected rate of 
MBHB sources is 10-100 year~! for NGO/eLISA and 10-1000year~! for LISA.?? 
For ASTROD-GW, we are expecting similar number of sources but with better 
angular resolution (Sec. 7.3).4° 

A sample of MBHB merger sources are drawn on Figs. 4-1. The black lines 
show the inspiral, coalescence and oscillation phases of GW emission from various 
equal-mass black-hole binary mergers in circular orbits at various redshift: solid 
line, z = 1; dashed line, z = 5; long-dashed line z = 20. The 10°M,-10° Mo 
MBHB merger at z = 1, 10° Mo-10° Ms MBHB merger at z = 20, and 10*M,- 
10* Mo MBHB merger at z = 5 are from Schutz?°° for Fig. 4; others by scaling; the 
corresponding curves in Figs. 5 and 1 by transformation equation (1). MBHB merger 
events have large signal to noise ratio for space detectors. Some of these events with 
equal mass (from 10?—-10!° Mo) and circular orbit are shown in Figs. 4-1. They are 
all candidates for space-borne detectors. Some could be in earlier phases for future 
ground-based detectors. 

With the detection of MBHB merger events and background, the properties and 
distribution of MBHs could be deduced and underlying population models could 
be tested. 

PTAs have been collecting data for decades for detection of stochastic GW back- 
ground from MBHB mergers. In modeling the MBHB stochastic GW background 
spectra, various authors obtained the following frequency dependence: 


$$ h_c(f) = A_{\rm year}\,[f/(1\ {\rm yr}^{-1})]^{\alpha}, \tag{15} $$

with α = −(2/3).” PTAs have improved greatly on the sensitivity for GW detection
recently.[106–108] They have put upper limits on the isotropic stochastic background
assuming the frequency dependence (15) with α = −(2/3) as follows: from the European
PTA (EPTA), A_year < 3 × 10^-15; from the Parkes PTA (PPTA), A_year < 1 × 10^-15; and from the
North American Nanohertz Observatory for Gravitational Waves (NANOGrav),
A_year < 1.5 × 10^-15. The three experiments form a robust upper limit of 1 × 10^-15
on A_year at 95% confidence level, ruling out most models of supermassive black hole
formation. The limit is shown as constraint on the Supermassive Black Hole Binary 
GW Background (SBHB-GWB) in Figs. 2-4 of Ref. 5 as solid line in the frequency 
range 10-°-10~* Hz. The GW energy released from co-evolution with galaxies must 
go somewhere. More energy of GWs might be emitted with higher frequency in 
the hierarchy of supermassive black hole formation. Hence, we have extrapolated
this constraint linearly (instead of with a knee around f ≈ 100 nHz as in most existing
models) with a dotted line to 10 µHz with some confidence in our review.[5] We adopt
the same thing in Figs. 4-1 here. Constraints with other α values have similar orders
of magnitude.
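
As a small illustration of the power law (15), the sketch below evaluates the characteristic strain allowed by a given limit on A_year at a few frequencies. The function name `characteristic_strain` and the use of a Julian year (365.25 days) for the reference frequency 1 yr^-1 are assumptions made for this example and are not taken from the references.

```python
# Minimal sketch of Eq. (15): h_c(f) = A_year * (f / 1 yr^-1)**alpha.
# Assumption: "1 yr" is taken as a Julian year of 365.25 days.
SECONDS_PER_YEAR = 365.25 * 86400.0
F_YEAR_HZ = 1.0 / SECONDS_PER_YEAR  # reference frequency 1 yr^-1 in Hz


def characteristic_strain(f_hz, a_year, alpha=-2.0 / 3.0):
    """Power-law characteristic strain at frequency f_hz (Hz)."""
    return a_year * (f_hz / F_YEAR_HZ) ** alpha


if __name__ == "__main__":
    a_limit = 1.0e-15  # upper limit on A_year quoted in the text
    for f in (1.0e-9, F_YEAR_HZ, 1.0e-7):
        print(f"f = {f:.3e} Hz -> h_c <= {characteristic_strain(f, a_limit):.3e}")
```

With α = −2/3 the strain falls by a factor of four for every factor of eight increase in frequency.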


6.2. Extreme mass ratio inspirals 


EMRIs are GW sources for space GW detectors. The NGO/eLISA sensitive range 
for central MBH masses is 104-10’ Mo. The expected number of NGO/eLISA detec- 
tions over 2 year is 10-20 (Ref. [21]); for LISA, a few tens?!; for ASTROD-GW, 
similar or more with sensitivity toward larger central BH’s and with better angular 
resolution (Sec. 7.3).4° 


6.3. Testing relativistic gravity 


An important scientific goal of LISA®?! and NGO/eLISA?! is to test general 
relativity and to study black hole physics with precision in strong gravity. With 
better precision in 100nHz—1 mHz frequency range, ASTROD is going to push this 
goal further in many aspects. These include testing strong-field gravity, precision 
probing of Kerr spacetime and measuring/constraining the mass of graviton. Some 
considerations have been given in Refs. 109 and 110. Lower frequency sensitivity
is significant in improving the precision of various tests.[109,110] Further studies in
these respects would be of great value.


6.4. Dark energy and cosmology 


In the dark energy issue,[111] it is important to determine the value of w in the
equation of state of dark energy,

$$ w = p/\rho, \tag{16} $$

as a function of different epochs, where p is the pressure and ρ the density of dark
energy. For a cosmological constant as dark energy, w = −1. From cosmological
observations, our universe is close to being flat. In a flat Friedmann–Lemaître–Robertson–
Walker (FLRW) universe, the luminosity distance is given by

$$ d_L(z) = (1+z)\,H_0^{-1}\int_0^z dz'\,\left[\Omega_m(1+z')^3 + \Omega_{DE}(1+z')^{3(1+w)}\right]^{-1/2}, \tag{17} $$

where H_0 is the Hubble constant, Ω_m is the present matter density parameter, Ω_DE is
the present dark energy density parameter, and the equation of state of the dark energy
w is assumed to be constant. In the case of nonconstant w and a nonflat FLRW universe,
a similar but more complicated expression can be derived. Here, we show (17) for
illustrative purposes. From the
observed relation of luminosity distance versus redshift z, the parameter w of the 
equation of state as a function of redshift z can be solved for and compared with 
various cosmological models. Dark energy cosmological models can be tested this
way. Luminosity distance from supernova observations and from gamma ray burst
observations versus redshift observations are the focus for the current dark energy
probes.
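
To make the dependence of the luminosity distance on w concrete, the following sketch integrates the flat-universe, constant-w expression numerically with Simpson's rule. It is only an illustration: the function name, the default values H0 = 70 km/s/Mpc and Ω_DE = 0.7, the flatness relation Ω_m = 1 − Ω_DE, and the explicit factor of c (set to 1 in the formula) are assumptions of this example.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s (restores the c set to 1 in the text)


def luminosity_distance(z, h0=70.0, omega_de=0.7, w=-1.0, n=2000):
    """Luminosity distance in Mpc for a flat FLRW universe with constant w.

    h0 is in km/s/Mpc; omega_m = 1 - omega_de is assumed (flatness);
    n is the (even) number of Simpson intervals.
    """
    omega_m = 1.0 - omega_de

    def integrand(zp):
        e2 = omega_m * (1.0 + zp) ** 3 + omega_de * (1.0 + zp) ** (3.0 * (1.0 + w))
        return 1.0 / math.sqrt(e2)

    h = z / n
    total = integrand(0.0) + integrand(z)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return (1.0 + z) * (C_KM_S / h0) * total * h / 3.0


if __name__ == "__main__":
    for w in (-1.2, -1.0, -0.8):
        print(f"w = {w:+.1f}: d_L(z=1) = {luminosity_distance(1.0, w=w):.1f} Mpc")
```

Comparing the printed distances for different w at fixed redshift shows the size of the effect that a distance-redshift measurement has to resolve.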

Space GW detectors observing MBHB inspirals and EMRIs are good probes to 
determine the luminosity distances. With the redshift of the source determined by 
the electromagnetic observations of associated galaxies or cluster of galaxies, these 
space GW detectors are also dark energy probes. In the merging of MBHs during 
the galaxy co-evolution processes, gravitational waveforms generated give precise, 
gravitationally calibrated luminosity distances to high redshift. The inspiral signals 
of these binaries can serve as standard candles/sirens.[112,113] With better angular
resolution (Sec. 7.3), ASTROD-GW will have a better chance to identify the associated
electromagnetic redshift and therefore will be better for the determination of
the dark energy equation of state.[39,40]


6.5. Compact binaries 


Space GW detectors are also sensitive to the GWs from Galactic compact
binaries.[9,21] These detectors will be able to survey compact stellar-mass binaries and
study the structure of the Galaxy. NGO/eLISA will detect about 3000 double white
dwarf binaries individually, with most in the GW frequency band 3–6 mHz (orbit
period about 150–300 s); for LISA, about 10,000 double white dwarf binaries.[9,21]
These sources constitute the population which has been proposed as progenitors
of normal type Ia and peculiar supernovae. For a review on the electromagnetic
counterparts of GW mergers of compact objects, see, e.g. Ref. 114. In the frequency
band 3–6 mHz, NGO/eLISA is more sensitive than ASTROD-GW (Fig. 4).
Since NGO/eLISA will be flying first, these GW signals will serve as a calibration
for ASTROD-GW in addition to the verification binaries. The eight verification
binaries selected by NGO/eLISA are shown in Fig. 4 as red squares with two-year
integration time (from Ref. 21, p. 14, Fig. 1).

At GW frequencies below a few mHz, millions of ultra-compact binaries will form 
a detectable foreground for NGO/eLISA and ASTROD-GW. At these frequencies, 
ASTROD-GW is more sensitive than NGO/eLISA (Fig. 4). More sources will be 
resolved individually and ASTROD-GW can improve on the observational results 
of NGO/eLISA. 


6.6. Relic GWs 


For direct detection of primordial (inflationary, relic) GWs in space, one may go to 
frequencies lower or higher than LISA bandwidth,?!!° where there are potentially 
less foreground astrophysical sources!!® to mask detection. DECIGO“* and Big 
Bang Observer*? look for GWs in the higher frequency range while ASTROD- 
GW3?!+5 looks for GWs in the lower frequency range. Their instrument sensitivity 
goals all reach 10~!” in terms of critical density. The main issue is the level of 
foreground and whether foreground could be separated.

The straight line in the bottom left corner of Fig. 4 corresponds to Qgw = 107!°
Cosmic Polarization Background (CMB) upper limit (See, e.g. Ref. 5) of infla- 
tionary GW background. For ASTROD-GW, when a 6-S/C formation is used 
for correlated detection of stochastic GWs, the sensitivity can reach this region. 
However, the anticipated upper limit of MBH-MBH GW background is above the 
3-S/C ASTROD-GW sensitivity. If this background is detected, then the detectabil- 
ity of inflationary GW of the strength Qe, = 10~'°-10~!" from 6-S/C formation 
in the ASTROD-GW frequency region depends on whether this MBH-MBH GW 
‘foreground’ could be separated due to different frequency dependence or other 
signatures.?° 

Other potentially possible GW sources in the relevant frequency band, e.g. cos- 
mic strings, should also be studied. See Ref. 94 for cosmic strings and some other 
sources. 


7. Basic Orbit Configuration, Angular Resolution and 
Multi-Formation Configurations 


In this section, we review and summarize the basic LISA-like and ASTROD-GW
configurations, their angular resolutions and multi-formation configurations. These 
basic configurations can be used for starting numerical design and numerical orbit 
optimization for missions in these two categories. 


7.1. Basic LISA-like orbit configuration 


As in Fig. 1, the center of mass of the basic LISA-like configuration[9,48,117–122]
follows a circular orbit of radius R (= 1 AU) around the Sun. Since the distance
(arm length) L between the spacecraft is much smaller than the circular orbit radius
1 AU, we can treat the spacecraft orbits as perturbations of the circular
orbit. The equations for the perturbed orbit are known as the Euler–Hill equations,
Hill equations, or Clohessy–Wiltshire equations. Hill used these equations for
research in lunar theory in the 19th century.[123] Clohessy and Wiltshire[124]
derived and used these equations for designing a terminal guidance system for satellite
rendezvous in 1960, after the space era began in 1957. Clohessy and Wiltshire used
a frame, called the CW frame, with its origin on the circular reference (center of
configuration) orbit and rotating with the same angular velocity Ω
as the reference orbit. For the perturbed orbit to keep the same distance
to the origin and to remain stationary in the CW frame, it is clear by calculating
the difference of the perturbed orbit and fiducial orbit that the eccentricity e and
the inclination i with respect to the ecliptic need to be

$$ e = 3^{-1/2}\,\frac{L}{2R}, \qquad i = \frac{L}{2R}, \tag{18} $$

to first order in the perturbation, i.e. to O(L/(2R)) [= O(α)]. One way to form a
nearly triangular configuration with side or arm length L(1 + O(α)) is to require the
orbit nodes be separated by 120°, and to choose the true anomalies and arguments
of perihelion such that each spacecraft at its aphelion is also at its maximum height
above (north of) the ecliptic (first configuration); the other way is with aphelion
at its minimum height below (south of) the ecliptic (second configuration).[9] With
these choices, the mission configuration plane is at 60° from the ecliptic, with its
intersection with the ecliptic tangential to the fiducial (center of configuration) circular
orbit. For a square configuration, just require the orbit nodes to be separated by
90°; one can similarly construct any regular polygon configuration or any planar
configuration. The first configuration rotates clockwise; the second rotates
counterclockwise. Thus, one reaches the conclusion that in the CW frame there are just two
planes, which make angles of ±60° with the reference orbit plane, in which
spacecraft (test particles) obeying the CW equations perform rigid rotations about
the origin with angular velocity −Ω.

We follow Dhurandhar et al.[122] and Wang and Ni[84] to write down the equations
for the basic orbits of the three spacecraft for the LISA-like configurations. First,
the equation of an elliptical orbit in the general X–Y plane is given by

$$ X = R(\cos\psi + e), \qquad Y = R(1 - e^2)^{1/2}\sin\psi, \tag{19} $$

where R is the semi-major axis of the ellipse, e the eccentricity and ψ the eccentric
anomaly.

Define α to be the ratio of the planned arm length L of the orbit configuration
to twice the radius R (1 AU) of the mean Earth orbit around the Sun, i.e. α = L/(2R).
Choose the initial time t0 to be a specific epoch in the Julian calendar and work in
the Heliocentric Coordinate System (X, Y, Z). The X-axis is in the direction of the vernal
equinox. A set of elliptical S/C orbits can be defined as

$$ X_f = R(\cos\psi_f + e)\cos\varepsilon, \qquad Y_f = R(1 - e^2)^{1/2}\sin\psi_f, \qquad Z_f = R(\cos\psi_f + e)\sin\varepsilon. \tag{20} $$

Here, R = 1 AU; e = 0.001925; ε = 0.00333. The eccentric anomaly ψ_f is related to
the mean anomaly Ω(t − t0) by

$$ \psi_f + e\sin\psi_f = \Omega(t - t_0). \tag{21} $$

Here, Ω is defined as 2π/(one sidereal year). The eccentric anomaly ψ_f can be solved for
by numerical iteration. Define ψ_{fk} to be implicitly given by

$$ \psi_{fk} + e\sin\psi_{fk} = \Omega(t - t_0) - 120^\circ (k - 1), \qquad \text{for } k = 1, 2, 3. \tag{22} $$

Define X_{fk}, Y_{fk}, Z_{fk} (k = 1, 2, 3) to be

$$ \begin{aligned}
X_{fk} &= R(\cos\psi_{fk} + e)\cos\varepsilon, \\
Y_{fk} &= R(1 - e^2)^{1/2}\sin\psi_{fk}, \\
Z_{fk} &= R(\cos\psi_{fk} + e)\sin\varepsilon.
\end{aligned} \tag{23} $$

Define φ0 ≡ ψ_E − 10°, where ψ_E is the position angle of Earth with respect to the
X-axis at t0. Define X_{f(k)}, Y_{f(k)}, Z_{f(k)} (k = 1, 2, 3), i.e. X_{f(1)}, Y_{f(1)}, Z_{f(1)}; X_{f(2)},
Y_{f(2)}, Z_{f(2)}; X_{f(3)}, Y_{f(3)}, Z_{f(3)}, to be

$$ \begin{aligned}
X_{f(k)} &= X_{fk}\cos[120^\circ (k - 1) + \varphi_0] - Y_{fk}\sin[120^\circ (k - 1) + \varphi_0], \\
Y_{f(k)} &= X_{fk}\sin[120^\circ (k - 1) + \varphi_0] + Y_{fk}\cos[120^\circ (k - 1) + \varphi_0], \\
Z_{f(k)} &= Z_{fk}.
\end{aligned} \tag{24} $$

The basic orbits of the three S/C (for one-body central problem) are

$$ \begin{aligned}
R_{S/C1} &= (X_{f(1)}, Y_{f(1)}, Z_{f(1)}), \\
R_{S/C2} &= (X_{f(2)}, Y_{f(2)}, Z_{f(2)}), \\
R_{S/C3} &= (X_{f(3)}, Y_{f(3)}, Z_{f(3)}).
\end{aligned} \tag{25} $$
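
The construction in Eqs. (20)-(25) can be sketched numerically as follows. This is a minimal illustration, not the code used in the cited works: the function names are hypothetical, t0 is taken as the time origin, φ0 is set to zero by default, the sidereal year is taken as 365.256363 days, and e and ε are set from α by the first-order relations e = α/3^{1/2} and ε = α. Kepler's equation in the form ψ + e sin ψ = M (anomalies measured from aphelion) is solved by Newton iteration.

```python
import math

OMEGA = 2.0 * math.pi / 365.256363  # rad/day; sidereal year length is an assumed value


def solve_eccentric_anomaly(mean_anomaly, e, tol=1e-14, max_iter=50):
    """Solve psi + e*sin(psi) = mean_anomaly by Newton iteration."""
    psi = mean_anomaly
    for _ in range(max_iter):
        step = (psi + e * math.sin(psi) - mean_anomaly) / (1.0 + e * math.cos(psi))
        psi -= step
        if abs(step) < tol:
            break
    return psi


def spacecraft_position(k, t_days, alpha, phi0=0.0, R=1.0):
    """Heliocentric position (AU) of S/C k = 1, 2, 3 at time t - t0 = t_days."""
    e = alpha / math.sqrt(3.0)             # first-order eccentricity
    eps = alpha                            # first-order inclination in rad
    shift = 2.0 * math.pi / 3.0 * (k - 1)  # 120 deg * (k - 1)
    psi = solve_eccentric_anomaly(OMEGA * t_days - shift, e)
    xf = R * (math.cos(psi) + e) * math.cos(eps)
    yf = R * math.sqrt(1.0 - e * e) * math.sin(psi)
    zf = R * (math.cos(psi) + e) * math.sin(eps)
    ang = shift + phi0                     # rotation about the Z-axis
    return (xf * math.cos(ang) - yf * math.sin(ang),
            xf * math.sin(ang) + yf * math.cos(ang),
            zf)


def arm_lengths(t_days, alpha):
    """Lengths (AU) of the arms S/C2-S/C3, S/C3-S/C1 and S/C1-S/C2."""
    p = [spacecraft_position(k, t_days, alpha) for k in (1, 2, 3)]
    return [math.dist(p[i], p[j]) for i, j in ((1, 2), (2, 0), (0, 1))]


if __name__ == "__main__":
    alpha = 0.00333  # corresponds to the e and epsilon values quoted in the text
    for t in (0, 90, 180, 270):
        print(t, [round(L / (2.0 * alpha), 4) for L in arm_lengths(t, alpha)])
```

The printed ratios of arm length to L = 2αR stay close to one, which is the first-order statement that the triangle is nearly rigid.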


The initial positions can be obtained by choosing t = t0 and the initial velocities by
calculating the derivatives with respect to time at t = t0. As an example, if we choose
t0 = JD2459215.5 (2021-Jan-1st 00:00:00), the initial conditions (states) of the three
spacecraft of eLISA/NGO in J2000.0 solar system barycentric Earth mean equator
and equinox coordinates are calculated and tabulated in the third column of Table 2
(from Table 2 of Ref. 84). From these initial conditions, one could start to design and
optimize the orbit configuration numerically using planetary and lunar ephemerides
as in Sec. 8.2. For another choice at a different epoch (e.g. at an epoch in 2035 closer
to the eLISA/NGO planned arrival at science orbit), the procedure is the same.
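
As a quick cross-check of the epoch quoted above, the sketch below converts a Julian Date to a calendar date using the standard relation JD 2451545.0 = 2000-Jan-1 12:00. The function name is hypothetical, and differences between time scales (e.g. UTC versus barycentric dynamical time) are ignored in this illustration.

```python
from datetime import datetime, timedelta

J2000_JD = 2451545.0                         # Julian Date of 2000-Jan-1 12:00
J2000_EPOCH = datetime(2000, 1, 1, 12, 0, 0)


def jd_to_calendar(jd):
    """Convert a Julian Date to a Gregorian calendar datetime (time scale ignored)."""
    return J2000_EPOCH + timedelta(days=jd - J2000_JD)


if __name__ == "__main__":
    print(jd_to_calendar(2459215.5))
```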


7.2. Basic ASTROD orbit configuration 


In the original proposal, the ASTROD-GW orbits are chosen in the ecliptic plane
with inclination λ = 0. The angular resolution in the sky has antipodal ambiguity.
Although over most of the sky the resolution is good, near the ecliptic poles the
resolution is poor. After 2010, we have designed the basic orbits of ASTROD-GW to
have small inclinations in order to resolve these issues while keeping the variation
of the arm lengths in the tolerable range.[39,40]

Following Refs. 39 and 40, the basic idea is that if the orbits of the ASTROD-GW
spacecraft are inclined with a small angle λ, the interferometry plane with
appropriate design is also inclined with a similar angle and, when the ASTROD-GW
formation evolves, the interferometry plane can be designed to modulate in the
ecliptic solar system barycentric frame. With this, the angular positions of GW sources
both near the polar region and off the polar region are resolved without antipodal
ambiguity (see also Sec. 7.3).

Let us first consider a circular orbit of a spacecraft in the Newtonian gravitational
central problem (one-body central problem) in spherical coordinates (r, θ, φ):


$$ r = a, \qquad \theta = 90^\circ, \qquad \varphi = \omega t + \varphi_0, \tag{26} $$

Table 2. Initial states (conditions) of the 3 S/C of eLISA/NGO at epoch JD2459215.5 (2021-Jan-1st
00:00:00) for our initial choice (third column), after first stage optimization (fourth column) and
after all optimizations (fifth column) in the J2000 equatorial (Earth mean equator and equinox
coordinates) solar system barycentric coordinate system.

                           Initial choice of          Initial states of S/C      Initial states of S/C
                           S/C initial states         after first stage          after final
                                                      optimization               optimization

S/C1 Position (AU)    X    -1.53222193865 × 10^-2     -1.53222193865 × 10^-2     -1.53221933735 × 10^-2
                      Y     9.23347976632 × 10^-1      9.23347976632 × 10^-1      9.23345222988 × 10^-1
                      Z     4.04072005496 × 10^-1      4.04072005496 × 10^-1      4.04070800735 × 10^-1

S/C1 Velocity         Vx   -1.71752389145 × 10^-2     -1.71926502995 × 10^-2     -1.71928071373 × 10^-2
(AU/day)              Vy   -1.41699055355 × 10^-4     -1.41837311087 × 10^-4     -1.41838556464 × 10^-4
                      Vz   -6.11987395198 × 10^-5     -6.12586807155 × 10^-5     -6.12592206525 × 10^-5

S/C2 Position (AU)    X    -1.86344993528 × 10^-2     -1.86344993528 × 10^-2     -1.86344993528 × 10^-2
                      Y     9.22658604804 × 10^-1      9.22658604804 × 10^-1      9.22658604804 × 10^-1
                      Z     3.98334135807 × 10^-1      3.98334135807 × 10^-1      3.98334135807 × 10^-1

S/C2 Velocity         Vx   -1.72244923440 × 10^-2     -1.72419907995 × 10^-2     -1.72419907995 × 10^-2
(AU/day)              Vy   -1.88198725403 × 10^-4     -1.88384533079 × 10^-4     -1.88384533079 × 10^-4
                      Vz   -2.71845314386 × 10^-5     -2.72100311132 × 10^-5     -2.72100311132 × 10^-5

S/C3 Position (AU)    X    -1.19599845212 × 10^-2     -1.19599845212 × 10^-2     -1.19599845212 × 10^-2
                      Y     9.22711604030 × 10^-1      9.22711604030 × 10^-1      9.22711604030 × 10^-1
                      Z     3.98357113784 × 10^-1      3.98357113784 × 10^-1      3.98357113784 × 10^-1

S/C3 Velocity         Vx   -1.72249891952 × 10^-2     -1.72424881557 × 10^-2     -1.72424881557 × 10^-2
(AU/day)              Vy   -9.59855278460 × 10^-5     -9.60776184172 × 10^-5     -9.60776184172 × 10^-5
                      Vz   -9.55537821052 × 10^-5     -9.56487660660 × 10^-5     -9.56487660660 × 10^-5


where a, ω, and φ0 are constants. For the spacecraft in this discussion, we have a = 1 AU,
ω = 2π/T0 with T0 = 1 sidereal year, and φ0 is the initial phase in the coordinate
considered. The spacecraft orbit at time t in Cartesian coordinates is

$$ x = a\cos\varphi = a\cos(\omega t + \varphi_0); \qquad y = a\sin\varphi = a\sin(\omega t + \varphi_0); \qquad z = 0. \tag{27} $$

Let us transform this orbit actively into an orbit with inclination λ, and with the
intersection of the orbit plane and the xy-plane (the ecliptic) at the line φ = Φ0 in the
xy-plane. The active transformation matrix is

$$ R(\lambda; \Phi_0) = \begin{pmatrix}
\cos^2\Phi_0 + \sin^2\Phi_0\cos\lambda & \sin\Phi_0\cos\Phi_0(1 - \cos\lambda) & \sin\Phi_0\sin\lambda \\
\sin\Phi_0\cos\Phi_0(1 - \cos\lambda) & \sin^2\Phi_0 + \cos^2\Phi_0\cos\lambda & -\cos\Phi_0\sin\lambda \\
-\sin\Phi_0\sin\lambda & \cos\Phi_0\sin\lambda & \cos\lambda
\end{pmatrix}. \tag{28} $$
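The new spacecraft orbit is

$$ \begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix}
a[1 - \sin^2\Phi_0(1 - \cos\lambda)]\cos\varphi + a\sin\Phi_0\cos\Phi_0(1 - \cos\lambda)\sin\varphi \\
a\cos\Phi_0\sin\Phi_0(1 - \cos\lambda)\cos\varphi + a[1 - \cos^2\Phi_0(1 - \cos\lambda)]\sin\varphi \\
-a\sin\Phi_0\sin\lambda\cos\varphi + a\cos\Phi_0\sin\lambda\sin\varphi
\end{pmatrix}. \tag{29} $$

For the three orbits with inclination λ (in radian), we choose:

$$ \begin{aligned}
&\text{S/C I:} & \Phi_0({\rm I}) &= 270^\circ, & \varphi_0({\rm I}) &= 0^\circ; \\
&\text{S/C II:} & \Phi_0({\rm II}) &= 150^\circ, & \varphi_0({\rm II}) &= 120^\circ; \\
&\text{S/C III:} & \Phi_0({\rm III}) &= 30^\circ, & \varphi_0({\rm III}) &= 240^\circ.
\end{aligned} \tag{30} $$

Defining

$$ \xi \equiv 1 - \cos\lambda = 0.5\lambda^2 + O(\lambda^4), \tag{31} $$

from Eqs. (29) and (30), we have

(i) for the orbit of S/C I

$$ \begin{pmatrix} x^{\rm I} \\ y^{\rm I} \\ z^{\rm I} \end{pmatrix} = \begin{pmatrix}
a\cos\omega t - \xi a\cos\omega t \\
a\sin\omega t \\
a\cos\omega t\,\sin\lambda
\end{pmatrix}, \tag{32} $$

(ii) for the orbit of S/C II

$$ \begin{pmatrix} x^{\rm II} \\ y^{\rm II} \\ z^{\rm II} \end{pmatrix} = \begin{pmatrix}
a\left[-\frac{1}{2}\cos\omega t - \frac{3^{1/2}}{2}\sin\omega t\right] + \frac{1}{2}\xi a\left[\frac{3^{1/2}}{2}\sin\omega t - \frac{1}{2}\cos\omega t\right] \\
a\left[-\frac{1}{2}\sin\omega t + \frac{3^{1/2}}{2}\cos\omega t\right] + \frac{3^{1/2}}{2}\xi a\left[\frac{3^{1/2}}{2}\sin\omega t - \frac{1}{2}\cos\omega t\right] \\
a\sin\lambda\left[\frac{3^{1/2}}{2}\sin\omega t - \frac{1}{2}\cos\omega t\right]
\end{pmatrix}, \tag{33} $$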
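(iii) for the orbit of S/C III

$$ \begin{pmatrix} x^{\rm III} \\ y^{\rm III} \\ z^{\rm III} \end{pmatrix} = \begin{pmatrix}
a\left[-\frac{1}{2}\cos\omega t + \frac{3^{1/2}}{2}\sin\omega t\right] + \frac{1}{2}\xi a\left[-\frac{3^{1/2}}{2}\sin\omega t - \frac{1}{2}\cos\omega t\right] \\
a\left[-\frac{1}{2}\sin\omega t - \frac{3^{1/2}}{2}\cos\omega t\right] + \frac{3^{1/2}}{2}\xi a\left[\frac{3^{1/2}}{2}\sin\omega t + \frac{1}{2}\cos\omega t\right] \\
a\sin\lambda\left[-\frac{3^{1/2}}{2}\sin\omega t - \frac{1}{2}\cos\omega t\right]
\end{pmatrix}. \tag{34} $$

One can readily check that $[(x^{\rm I})^2 + (y^{\rm I})^2 + (z^{\rm I})^2]^{1/2} = [(x^{\rm II})^2 + (y^{\rm II})^2 + (z^{\rm II})^2]^{1/2} =
[(x^{\rm III})^2 + (y^{\rm III})^2 + (z^{\rm III})^2]^{1/2} = a$ holds for consistency.
Calculate the arm vectors $V_{\rm II-I} \equiv r^{\rm II} - r^{\rm I}$, $V_{\rm III-II} \equiv r^{\rm III} - r^{\rm II}$ and $V_{\rm I-III} \equiv r^{\rm I} - r^{\rm III}$:

$$ V_{\rm II-I} = \begin{pmatrix}
a\left[-\frac{3}{2}\cos\omega t - \frac{3^{1/2}}{2}\sin\omega t\right] + a\xi\left[\frac{3^{1/2}}{4}\sin\omega t + \frac{3}{4}\cos\omega t\right] \\
a\left[-\frac{3}{2}\sin\omega t + \frac{3^{1/2}}{2}\cos\omega t\right] + a\xi\left[\frac{3}{4}\sin\omega t - \frac{3^{1/2}}{4}\cos\omega t\right] \\
a\sin\lambda\left[\frac{3^{1/2}}{2}\sin\omega t - \frac{3}{2}\cos\omega t\right]
\end{pmatrix}, \tag{35} $$

$$ V_{\rm III-II} = \begin{pmatrix}
3^{1/2}a\sin\omega t - \frac{3^{1/2}}{2}a\xi\sin\omega t \\
-3^{1/2}a\cos\omega t + \frac{3^{1/2}}{2}a\xi\cos\omega t \\
-3^{1/2}a\sin\lambda\sin\omega t
\end{pmatrix}, \tag{36} $$

$$ V_{\rm I-III} = \begin{pmatrix}
a\left[\frac{3}{2}\cos\omega t - \frac{3^{1/2}}{2}\sin\omega t\right] + a\xi\left[\frac{3^{1/2}}{4}\sin\omega t - \frac{3}{4}\cos\omega t\right] \\
a\left[\frac{3}{2}\sin\omega t + \frac{3^{1/2}}{2}\cos\omega t\right] + a\xi\left[-\frac{3}{4}\sin\omega t - \frac{3^{1/2}}{4}\cos\omega t\right] \\
a\sin\lambda\left[\frac{3^{1/2}}{2}\sin\omega t + \frac{3}{2}\cos\omega t\right]
\end{pmatrix}. \tag{37} $$

The closure relation $V_{\rm II-I} + V_{\rm III-II} + V_{\rm I-III} = 0$ is also checked to verify the
calculations. The arm lengths are calculated to be

$$ \begin{aligned}
|V_{\rm II-I}| &= 3^{1/2}a\left[(1 - \xi/2)^2 + \sin^2\lambda\,\sin^2(\omega t - 60^\circ)\right]^{1/2}, \\
|V_{\rm III-II}| &= 3^{1/2}a\left[(1 - \xi/2)^2 + \sin^2\lambda\,\sin^2(\omega t)\right]^{1/2}, \\
|V_{\rm I-III}| &= 3^{1/2}a\left[(1 - \xi/2)^2 + \sin^2\lambda\,\sin^2(\omega t + 60^\circ)\right]^{1/2}.
\end{aligned} \tag{38} $$

The fractional arm length variation is within (1/2) sin²λ, which is about 10^-4 for λ
around 1°.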
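
The arm lengths in Eq. (38) can be checked numerically by applying the rotation of Eq. (28) to the circular orbit of Eq. (27) with the choices of Eq. (30). The sketch below does this; the function names and the dictionary of orbit choices are introduced only for this illustration, and a = 1 AU and λ = 1° are example values.

```python
import math


def rotation_matrix(lam, Phi0):
    """Active rotation by lam about the line phi = Phi0 in the xy-plane, Eq. (28)."""
    c, s = math.cos(Phi0), math.sin(Phi0)
    cl, sl = math.cos(lam), math.sin(lam)
    return [
        [c * c + s * s * cl, s * c * (1.0 - cl), s * sl],
        [s * c * (1.0 - cl), s * s + c * c * cl, -c * sl],
        [-s * sl, c * sl, cl],
    ]


# (Phi0, phi0) in degrees for S/C I, II and III, Eq. (30)
ORBIT_CHOICES_DEG = {"I": (270.0, 0.0), "II": (150.0, 120.0), "III": (30.0, 240.0)}


def position(sc, wt, lam, a=1.0):
    """Position of a spacecraft at orbital phase wt (rad), Eq. (29)."""
    Phi0, phi0 = (math.radians(x) for x in ORBIT_CHOICES_DEG[sc])
    phi = wt + phi0
    flat = (a * math.cos(phi), a * math.sin(phi), 0.0)
    rot = rotation_matrix(lam, Phi0)
    return tuple(sum(rot[i][j] * flat[j] for j in range(3)) for i in range(3))


def arm_lengths_numeric(wt, lam, a=1.0):
    """|V_II-I|, |V_III-II|, |V_I-III| from the rotated orbits."""
    r = {sc: position(sc, wt, lam, a) for sc in ORBIT_CHOICES_DEG}
    return (math.dist(r["II"], r["I"]),
            math.dist(r["III"], r["II"]),
            math.dist(r["I"], r["III"]))


def arm_lengths_eq38(wt, lam, a=1.0):
    """Closed-form arm lengths of Eq. (38), in the same order."""
    xi = 1.0 - math.cos(lam)
    base = (1.0 - xi / 2.0) ** 2
    s2 = math.sin(lam) ** 2
    sixty = math.radians(60.0)
    return tuple(math.sqrt(3.0) * a * math.sqrt(base + s2 * math.sin(wt + off) ** 2)
                 for off in (-sixty, 0.0, sixty))


if __name__ == "__main__":
    lam = math.radians(1.0)
    lengths = [L for d in range(360) for L in arm_lengths_numeric(math.radians(d), lam)]
    print((max(lengths) - min(lengths)) / min(lengths), 0.5 * math.sin(lam) ** 2)
```

For λ = 1° the two printed numbers agree closely, both about 1.5 × 10^-4, consistent with the estimate in the text.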

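The cross-product vector $N(t) \equiv V_{\rm II-I} \times V_{\rm III-II}$ is normal to the orbit
configuration plane and has the following components:

$$ N = \frac{3^{3/2}}{2}(1 - \xi/2)\,a^2 \begin{pmatrix}
-\sin\lambda\cos 2\omega t \\
-\sin\lambda\sin 2\omega t \\
1 - \xi/2
\end{pmatrix}. \tag{39} $$

The normalized unit normal vector n is then:

$$ n = \left[\sin^2\lambda + (1 - \xi/2)^2\right]^{-1/2} \begin{pmatrix}
-\sin\lambda\cos 2\omega t \\
-\sin\lambda\sin 2\omega t \\
1 - \xi/2
\end{pmatrix}. \tag{40} $$

The geometric center $V_c$ of the ASTROD-GW spacecraft configuration is

$$ V_c = \begin{pmatrix}
-\frac{1}{2}\xi a\cos\omega t \\
\frac{1}{2}\xi a\sin\omega t \\
0
\end{pmatrix}. \tag{41} $$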
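
The normal vector of Eq. (39) and the geometric center of Eq. (41) can be verified from the closed-form spacecraft positions of Eqs. (32)-(34). The following sketch is only a consistency check under the example values a = 1 AU and λ = 1°; the function names are hypothetical.

```python
import math


def positions(wt, lam, a=1.0):
    """Positions of S/C I, II, III from the closed forms of Eqs. (32)-(34)."""
    xi = 1.0 - math.cos(lam)
    c, s, sl = math.cos(wt), math.sin(wt), math.sin(lam)
    h = math.sqrt(3.0) / 2.0
    u = h * s - 0.5 * c    # (3^(1/2)/2) sin wt - (1/2) cos wt
    v = -h * s - 0.5 * c   # -(3^(1/2)/2) sin wt - (1/2) cos wt
    r1 = (a * c - xi * a * c, a * s, a * c * sl)
    r2 = (a * (-0.5 * c - h * s) + 0.5 * xi * a * u,
          a * (-0.5 * s + h * c) + h * xi * a * u,
          a * sl * u)
    r3 = (a * (-0.5 * c + h * s) + 0.5 * xi * a * v,
          a * (-0.5 * s - h * c) - h * xi * a * v,
          a * sl * v)
    return r1, r2, r3


def cross(p, q):
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])


def normal_and_center(wt, lam, a=1.0):
    """Numerical N = V_II-I x V_III-II and the mean position of the three S/C."""
    r1, r2, r3 = positions(wt, lam, a)
    v21 = tuple(x - y for x, y in zip(r2, r1))
    v32 = tuple(x - y for x, y in zip(r3, r2))
    center = tuple((x + y + z) / 3.0 for x, y, z in zip(r1, r2, r3))
    return cross(v21, v32), center


def normal_eq39(wt, lam, a=1.0):
    xi = 1.0 - math.cos(lam)
    k = 1.0 - xi / 2.0
    pref = 3.0 ** 1.5 / 2.0 * k * a * a
    return (-pref * math.sin(lam) * math.cos(2.0 * wt),
            -pref * math.sin(lam) * math.sin(2.0 * wt),
            pref * k)


def center_eq41(wt, lam, a=1.0):
    xi = 1.0 - math.cos(lam)
    return (-0.5 * xi * a * math.cos(wt), 0.5 * xi * a * math.sin(wt), 0.0)


if __name__ == "__main__":
    lam = math.radians(1.0)
    print(normal_and_center(math.radians(30.0), lam))
```

Because the horizontal components of N depend on 2ωt, the normal completes one turn about the ecliptic pole in half an orbital period.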


There are three interferometers with two arms in the ASTROD-GW configura- 
tion. The geometric center of each of these three interferometers is at a distance 
of about 0.25 AU from the Sun. Numerical simulation and optimization of the orbit
configuration for inclinations of 0.5°, 1°, 1.5°, 2°, 2.5° and 3° have been worked
out in Ref. 41 using planetary ephemerides to take into account the planetary
perturbations. The case with an inclination of 1° is reviewed in Sec. 8.3 for illustration.
When LISA configuration orbits are around the Sun, it is equivalent to multiple 
detector arrays distributed in 1 AU orbit. The extension of ASTROD-GW is already 
of 1 AU. When ASTROD-GW orbits are around the Sun, it is also equivalent to 
multiple detector arrays distributed in 1 AU orbit. 


7.3. Angular resolution 


Consider the angular resolution of a coherent GW source. Consider first the LISA case
as an example. The detector formation of LISA is modulated in its orbit around
the Sun. The azimuth modulation amplitude is 2π rad with inclination 1.05 rad
(60°) so that the antenna pattern sweeps around the sky in one year. The antenna
response is not isotropic, but the averaged linear angular resolution (in a year) of
monochromatic GW sources for LISA differs by less than a factor of three among
all directions.[9] This is also true for all LISA-like formations. The angular resolution
is basically proportional to the inverse of the strain signal to noise ratio. If the
inclination is of the order 0.017–0.052 rad (1–3°) for LISA, the polar resolution
would be worsened by 30–10 times (approximately the ratio of sin of 1.05 rad to sin of
0.03–0.1 rad); the steradian localization in the celestial sphere would be worsened
by the square of this factor. Away from the polar region (θ ≫ 0.017–0.052 rad), the
steradian localization in the celestial sphere would be downgraded by sin²θ. If the signal
to noise ratio is downgraded by 5 (as in eLISA/NGO or in OMEGA in its low frequency part
due to shorter arm length), the linear angular resolution is worsened by five times.
ASTROD-GW has less sensitivity above 1 mHz compared with LISA, therefore the
angular resolution will be worsened by both factors. In the 100 nHz–1 mHz region,
ASTROD-GW has better sensitivity compared with LISA, in most part by 52 times.
Hence, the angular resolution in the polar region is similar to that of LISA, while
in other regions, the linear resolution is enhanced by roughly 52 × sin θ (upgraded
by 52 but downgraded by sin²θ in sterad [by sin θ in rad]). Although there is a
mild dependence on the configuration inclination angle λ, within a factor of 3, the
averaged antenna pattern for ASTROD-GW away from the polar region is better
by a factor of 52 × sin θ compared to that of LISA. Since the antenna pattern
of ASTROD-GW sweeps over the whole sky in half a year as can be seen from
Eq. (40), the time of average needed is half a year instead of a year.[39,40]
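
The "30–10 times" worsening quoted above for the polar resolution is simply the ratio of two sines, and the steradian localization scales as its square. A short sketch of that arithmetic follows; the function name is hypothetical, and the angles are the ones quoted in the text.

```python
import math


def polar_degradation(large_angle_rad=1.05, small_angle_rad=0.03):
    """Ratio sin(large angle)/sin(small angle) used as the linear worsening factor."""
    return math.sin(large_angle_rad) / math.sin(small_angle_rad)


if __name__ == "__main__":
    for small in (0.03, 0.1):
        linear = polar_degradation(1.05, small)
        print(f"small angle {small} rad: linear x{linear:.1f}, steradian x{linear ** 2:.0f}")
```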

For more complicated sources like chirping GW sources from binary black holes 
(BBHs), one needs to do fitting in order to obtain the accuracy of the parameters. 
However, the tendency of accuracy of parameters is the same: for similar situations, 
it is proportional to the inverse of the strain signal to noise ratio. 

For Super-ASTROD, the strain signal to noise ratio would be even better than 
ASTROD-GW by five times toward the lower frequency region, therefore the angu- 
lar resolution would be better by five times. For polar resolution, the ASTROD incli- 
nation strategy could be applied. However, since Super-ASTROD has 1 or 2S/C in 
off-ecliptic orbit, this may not be needed. For ASTROD-EM, since the lunar orbit 
is inclined about 5° to the ecliptic and the node precession period is 18.61 tropical 
years, the Earth-Moon Lagrange points also precess together. Depending on the 
time and duration of mission, it might or might not be desirable to use slightly 
inclined orbit.4% 

Most of the Earth orbit GW missions have dipolar ambiguity and the resolution
is poor in the polar region. However, this is not a big issue since we just need to
look at both polarities for identification of electromagnetic counterparts and the
polar region is only a small portion of the sky.


7.4. Six/twelve spacecraft formation 


In order to detect relic GWs using correlated detection, Big Bang Observer*® and 


DECIGO“ proposals have 12 spacecraft distributed in the Earth orbit in three 
groups separated by 120° in orbit; two groups have three spacecraft each in a
LISA-like triangular formation and the third group has six spacecraft with two LISA-like
triangles forming a star configuration (Fig. 6). An alternate configuration is that
each group has four spacecraft forming a nearly square configuration (which also has a
tilt of 60° with respect to the ecliptic plane).

For a more sensitive detection of background or relic GWs, correlated detection
with two sets of triangular ASTROD-GW formations is required, i.e. a 6-S/C
constellation. The second nearly triangular formation could be put again near L3, 
L4, L5 respectively, but separated from the first formation by 1 x 10°km to 5 x 
10° km for the respective S/C.4° 


8. Orbit Design and Orbit Optimization Using Ephemerides 


Although the Sun is dominant in the solar system, there are planets and other celestial
objects affecting the orbit, notably Jupiter, Venus and Earth. An ephemeris is a must
for orbit design. At present, there are three complete fundamental ephemerides of
the solar system: Development Ephemerides (DE),[125] Ephemerides of Planets
and Moon (EPM)[126] and Intégrateur Numérique Planétaire de l'Observatoire de
Paris (INPOP).[127] Any of these three ephemerides could be used in the orbit design
and orbit optimization. For ease of numerical processing, we normally use the Center
for Gravitation and Cosmology (CGC) ephemeris framework together with initial
conditions taken from the DE ephemerides at a certain epoch for evolving with
post-Newtonian approximation.


8.1. CGC ephemeris 


In 1998, we started orbit simulation and parameter determination for
ASTROD.[128,129] We worked out a post-Newtonian ephemeris of the solar system
including the solar quadrupole moment, the eight planets, Pluto, the Moon and
the three biggest asteroids. We term this working ephemeris CGC 1 (CGC: Center
for Gravitation and Cosmology). Using this ephemeris as a deterministic model
and adding stochastic terms to simulate noise, we generate simulated ranging data
and use Kalman filtering to determine the accuracies of fitted relativistic and solar
system parameters for 1050 days of the ASTROD mission.

For a better evaluation of the accuracy of Ġ/G, we need also to monitor the
masses of other asteroids. For this, we considered all known 492 asteroids with
diameter greater than 65 km to obtain an improved ephemeris framework, CGC 2,
and calculated the perturbations due to these 492 asteroids on the ASTROD
spacecraft.[130,131]

In building the CGC ephemeris framework, we use the post-Newtonian barycentric
metric and equations of motion as derived in Brumberg[132] with Parametrized
Post-Newtonian (PPN) parameters β and γ for solar system bodies (with the gauge
parameter α set to zero). These equations are used to build our computer-integrated
ephemeris (with the PPN parameters γ = β = 1, J2 = 2 × 10^-7) for the eight planets,
Pluto, the Moon and the Sun. The initial positions and initial velocities at the
epoch 2005-June-10 0:00 are taken from the DE403 ephemeris. The evolution is
solved by using the fourth-order Runge–Kutta method with the step size h = 0.01
day. In Ref. 129, the 11-body evolution is extended to 14-body to include the three
big asteroids Ceres, Pallas and Vesta (CGC 1 ephemeris). Since the tilt of the
axis of the solar quadrupole moment to the perpendicular of the ecliptic plane is
small (7°), in the CGC 1 ephemeris, we have neglected this tilt. In the CGC 2 ephemeris,
we have added the perturbations of an additional 489 asteroids.

In our first optimization of ASTROD-GW orbits,[133–135] we have used the CGC 2.5
ephemeris in which only the three biggest minor planets are taken into account, but
the Earth's precession and nutation are added; the solar quadratic zonal harmonic
and the Earth's quadratic to quartic zonal harmonics are considered. In later
simulation, we add the perturbation of an additional 349 asteroids and call it the CGC 2.7
ephemeris.34 *° The differences in orbit evolution compared with DE405 for Earth
for 3700 days starting at JD2461944.0 (2028-Jun-21 12:00:00) are shown in Fig. 5
of Ref. 40. The differences in radial distances are less than about 200 m. The
differences for other inner planets are smaller. The differences in latitude and longitude
for Earth are less than 1 mas.
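
To illustrate the integration scheme mentioned in this section (fourth-order Runge–Kutta with step size h = 0.01 day), the sketch below propagates a test particle around the Sun. It is deliberately simplified and is not the CGC ephemeris: it keeps only the Newtonian attraction of a fixed Sun, with no planets, asteroids, post-Newtonian terms or solar quadrupole. The function names are hypothetical, and GM of the Sun is taken from the Gaussian gravitational constant.

```python
import math

# GM of the Sun in AU^3/day^2 (square of the Gaussian gravitational constant).
GM_SUN = 0.01720209895 ** 2


def acceleration(r):
    """Newtonian acceleration (AU/day^2) due to a Sun fixed at the origin."""
    d3 = math.sqrt(r[0] ** 2 + r[1] ** 2 + r[2] ** 2) ** 3
    return tuple(-GM_SUN * x / d3 for x in r)


def _axpy(p, q, s):
    """Return p + s*q component-wise."""
    return tuple(pi + s * qi for pi, qi in zip(p, q))


def rk4_step(r, v, h):
    """One classical RK4 step for dr/dt = v, dv/dt = acceleration(r)."""
    k1r, k1v = v, acceleration(r)
    k2r, k2v = _axpy(v, k1v, h / 2.0), acceleration(_axpy(r, k1r, h / 2.0))
    k3r, k3v = _axpy(v, k2v, h / 2.0), acceleration(_axpy(r, k2r, h / 2.0))
    k4r, k4v = _axpy(v, k3v, h), acceleration(_axpy(r, k3r, h))
    r_new = tuple(r[i] + h / 6.0 * (k1r[i] + 2.0 * k2r[i] + 2.0 * k3r[i] + k4r[i])
                  for i in range(3))
    v_new = tuple(v[i] + h / 6.0 * (k1v[i] + 2.0 * k2v[i] + 2.0 * k3v[i] + k4v[i])
                  for i in range(3))
    return r_new, v_new


def propagate(r, v, days, h=0.01):
    """Propagate the state for the given number of days with fixed step h (days)."""
    for _ in range(round(days / h)):
        r, v = rk4_step(r, v, h)
    return r, v


if __name__ == "__main__":
    r0 = (1.0, 0.0, 0.0)                   # circular orbit at 1 AU
    v0 = (0.0, math.sqrt(GM_SUN), 0.0)
    print(propagate(r0, v0, 50.0))
```

In a full ephemeris the same stepping scheme would be applied to the coupled equations of motion of all bodies; only the right-hand side changes.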


8.2. Numerical orbit design and orbit optimization for 
eLISA/NGO 


The mission orbit configuration of eLISA/NGO is similar to that of LISA but 
with a shorter arm length and a closer distance to Earth. The distance of any 
two of three spacecraft must be maintained as close as possible during geodetic 
flight. LISA orbit configuration has been studied analytically and numerically in
various previous works.[117–122,136,137] For eLISA/NGO, we followed the analytical
procedure of Dhurandhar et al.[122] (see Sec. 7.1) in making our initial choice of the
initial conditions in Ref. 84; these initial conditions are listed in column 3 of Table 2
in Sec. 7.1. With this orbit choice, we started numerical orbit design and used the
CGC ephemeris to numerically optimize the orbit configuration in Ref. 84 as we
have done for ASTROD-GW orbit[85,86,133–135,138] design. In this section, we review
and summarize the procedure following [84].

The goal of the eLISA/NGO mission orbit optimization is to equalize the three 
arm lengths of the eLISA/NGO formation and to reduce the relative line-of-sight 
velocities between three pairs of spacecraft as much as possible. In the solar system, 
the eLISA/NGO spacecraft orbits are perturbed by the planets. With the initial 
states of the three spacecraft as listed in column three of Table 2, we calculated 
the eLISA/NGO orbit configuration for 1000 days using CGC 2.7. The variations 
of arm lengths and velocities in the line-of-sight direction are drawn in Fig. 2 of 
[84]. The largest variations are caused by Earth, Jupiter and Venus. Our method 
of optimization is to modify the initial velocities and initial heliocentric distances 
so that (i) the perturbed orbital periods, averaged over 1000 days, remain close to one 
another, and (ii) the average major axes are adjusted to make the arms nearly equal. We 
do this iteratively as follows. From Fig. 2 of Ref. 84, we noticed that the variation 
of Arm1 (between S/C2 and S/C3) is small. First, we adjust the initial conditions 
of S/C2 and S/C3 to make the variation of Arm1 satisfy the mission requirements 
that arm length variations are within 2% and Doppler velocities are within 10 m/s. 
Then, we adjust the initial conditions of S/C1 so that Arm2 and Arm3 satisfy the 
mission requirements. Adjustments are always performed in the ecliptic heliocentric 
coordinate system. 
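
The two quantities monitored during this optimization, arm length and line-of-sight (Doppler) velocity, can be evaluated from spacecraft state vectors as in the following Python sketch. The function names and the toy states are hypothetical, and the 2% limit is interpreted here as a deviation from a nominal arm length, which is an assumption rather than the definition used in Ref. 84.

```python
import math
from itertools import combinations

def arm_metrics(states):
    """Return {(name_i, name_j): (arm_length_m, los_velocity_m_per_s)}.

    states maps a spacecraft name to (position, velocity), each a 3-tuple
    in metres and metres per second in a common inertial frame.
    """
    metrics = {}
    for (name_i, (r_i, v_i)), (name_j, (r_j, v_j)) in combinations(states.items(), 2):
        dr = [b - a for a, b in zip(r_i, r_j)]
        dv = [b - a for a, b in zip(v_i, v_j)]
        length = math.sqrt(sum(c * c for c in dr))
        # Line-of-sight velocity = rate of change of the arm length.
        los = sum(a * b for a, b in zip(dr, dv)) / length
        metrics[(name_i, name_j)] = (length, los)
    return metrics

def meets_requirements(history, nominal_m, max_fraction=0.02, max_los=10.0):
    """True if every sampled arm stays within the length and Doppler limits."""
    return all(
        abs(length - nominal_m) <= max_fraction * nominal_m and abs(los) <= max_los
        for sample in history
        for length, los in sample.values()
    )

# Toy states (assumed): an equilateral triangle with 1 Gm sides, S/C2 drifting.
side = 1.0e9
states = {
    "S/C1": ((0.0, 0.0, 0.0), (0.0, 0.0, 0.0)),
    "S/C2": ((side, 0.0, 0.0), (3.0, 4.0, 0.0)),
    "S/C3": ((0.5 * side, math.sqrt(3) / 2 * side, 0.0), (0.0, 0.0, 0.0)),
}
sample = arm_metrics(states)
ok = meets_requirements([sample], nominal_m=side)
```

In an optimization loop, `history` would hold one such sample per output epoch of the numerical orbit integration, and the initial conditions would be adjusted until the check passes for the whole time span.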

The actual adjustment procedure is as follows. First, the magnitudes 
of the initial velocities of S/C2 and S/C3 were adjusted so that their average periods 
(367.474 days) over 3 years were a little longer than one sidereal year. Within a 
definite range, the variations of Arm1 become smaller as the periods become longer. 
The initial velocities were adjusted so that Arm1 satisfied the eLISA/NGO arm 
length and Doppler velocity requirements. After this, we adjusted the initial 
velocities of S/C1 to make its orbital period approach those of S/C2 and S/C3, and 
to make Arm2 and Arm3 nearly equal. If the results obtained from the above procedure did 
not satisfy the requirements, or better results were expected, we could lengthen the 
orbital periods of S/C2 and S/C3 a little more, under the constraint that 
the eLISA/NGO requirements for Arm1 remain satisfied. Up to this stage, only the initial 
velocities have been adjusted. After completing this stage, the initial 
conditions of the 3 S/C are listed in column 4 of Table 2; the variations of arm lengths 
and velocities in the line-of-sight direction are drawn in Fig. 3 of Ref. 84. 

After the first stage, we optimized the orbital period of S/C1 by adjusting 
the initial velocity and the semi-major axis until the eLISA/NGO requirements 
were satisfied. The initial conditions of the 3 S/C after optimization are listed in 
column 5 of Table 2; the variations of arm lengths (within 2%) and velocities in the 
line-of-sight direction (within 5.5 m/s) are drawn in Fig. 7. In Fig. 7, we also draw 
the angle between the S/C and Earth subtended at the Sun over 1000 days; it starts at 10° 
behind Earth (in solar orbit) and varies between 9° and 16°, with a quasi-period of 
about one sidereal year, mainly due to Earth’s elliptic motion. 

Fig. 7. Variations of the arm lengths, the velocities in the line-of-sight direction, and the angle 
between S/C and Earth subtended at the Sun in 1000 days for the S/C configuration with initial 
conditions given in column 5 (after final optimization) of Table 2. 


8.3. Orbit optimization for ASTROD-GW [41] 


The goal of the ASTROD-GW mission orbit optimization is to equalize the three 
arm lengths of the ASTROD-GW formation and to reduce the relative line-of-sight 
velocities between the three pairs of spacecraft as much as possible. In our first 
optimization, the start of the science part of the mission was chosen to 
be noon, June 21, 2025 (JD2460848.0), and the optimization was for a period of 
3700 days using the CGC 2.5 ephemeris [133, 135]. Since the preparation of the mission 
may take longer and the extended mission lifetime 
may exceed 10 years, in later optimizations [85, 86, 138] we started at noon, 
June 21, 2028 (JD2461944.0) and optimized for a period of 20 years using the CGC 
2.7 ephemeris, which includes more asteroids than CGC 2.5. In both of these 
optimizations, the orbit configuration is set in the ecliptic plane, i.e. the 
inclination angle λ = 0. With the basic configuration of ASTROD-GW changed 
to an inclined precession orbit formation, we re-designed and re-optimized our orbit 
configuration numerically, starting at noon, June 21, 2035 (JD2464500.0), for 10 
years and for inclination angles of 0.5°, 1°, 1.5°, 2°, 2.5° and 3°, using the CGC 2.7.1 
ephemeris [41]. 
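
The correspondence between the Julian dates and the calendar epochs quoted above can be checked with a few lines of Python. This is a minimal sketch: the helper name is hypothetical, and the distinction between time scales (for example UTC versus the dynamical time used by an ephemeris) is ignored, which is an assumption adequate only for a date-level check.

```python
from datetime import datetime, timedelta

J2000_JD = 2451545.0                         # Julian date of the J2000.0 epoch
J2000_EPOCH = datetime(2000, 1, 1, 12, 0, 0)  # calendar date of J2000.0 (noon)

def jd_to_datetime(jd):
    """Convert a Julian date to a calendar date, ignoring time-scale offsets."""
    return J2000_EPOCH + timedelta(days=jd - J2000_JD)

for jd in (2460848.0, 2461944.0, 2464500.0):
    print(jd, jd_to_datetime(jd).isoformat())
```

The three start epochs map to noon of June 21 in 2025, 2028 and 2035, in agreement with the dates given in the text.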

In this section, we illustrate the design and optimization method for the inclined 
precession orbit formation with an inclination angle of 1°, following Ref. 41, 
which uses CGC 2.7.1. The differences between CGC 2.7.1 and CGC 2.7 (summarized 
in Sec. 8.1) are detailed in Sec. 8.3.1. In Sec. 8.3.2, we review how to obtain the 
initial choice of S/C initial conditions as a starting point for numerical 
optimization. In Sec. 8.3.3, we discuss the method of optimization and summarize the results of 
optimization. 


8.3.1. CGC 2.7.1 ephemeris 


In the CGC 2.7.1 ephemeris framework, we select 340 asteroids in addition to Ceres, 
Pallas and Vesta from the Lowell database. The masses of the 340 asteroids are taken 
from the Lowell data!®® instead of being estimated from the classification as in 
CGC 2.7 [83, 84, 86]. The orbital elements of these asteroids are also updated from the 
Lowell database. 

For a 10-year duration starting on June 21, 2035, the differences between the 
Earth’s heliocentric distances calculated by CGC 2.7.1 and DE430 are within 
150 m, and the differences in longitude and latitude are within 1.4 mas 
and 0.45 mas, respectively. These differences do not affect the results of our TDI 
calculations. 


8.3.2. Initial choice of spacecraft initial conditions 


The R.A. of the Earth at JD2464500 (2035-June-21 12:00:00) is 17h 57m 45.09s, i.e. 
269.438°, from the DE430 ephemeris. The initial positions of the 3 S/C are obtained 
by choosing ωt as 89.44° for φ = ωt + φ0 in Eq. (29). The initial velocities are 
derived from Eq. (29) by calculating the derivatives with respect to t. The S/C1 
orbit near the Lagrange point L3 is partly obscured by the Sun along the line of sight from 
Earth (left diagram of Fig. 8), which would obstruct communication with the Earth 
stations. To avoid the obscuration, we rotate the initial angles Φ0 and φ0 forward 
by 2.0° for the inclination angle 1.0°. The resulting S/C1 orbit is shown in the right diagram 
of Fig. 8. The initial choice of initial states for the 3 S/C in this case is listed in 
column 3 of Table 3. 

Fig. 8. S/C1 view from Earth before rotating the initial conditions (left diagram) 
and after rotating them by an angle of 2.0° (right diagram) for the case of inclination angle 1.0° [41]. 


8.3.3. Method of optimization 


Our optimization method is to modify the initial velocities and initial heliocentric 
distances to reach the aim of (i) equalizing the three arm lengths of the ASTROD- 
GW formation as much as possible and (ii) reducing the relative Doppler velocities 
between three pairs of spacecraft as much as possible. 

During the actual optimization procedure, we use the following equation to 
modify the average period of the orbit: 


$$V_{\rm new} = V_{\rm prev} + \Delta V \approx \left(1 + \frac{1}{3}\,\frac{\Delta T}{T}\right) V_{\rm prev}. \tag{42}$$


For the case of an inclination angle of 1°, we calculate the orbits of the 3 S/C with the 
initial choice of initial conditions listed in column 3 of Table 3 using the CGC 2.7.1 
ephemeris. The average periods of the 3 S/C over 10 years are 365.256 days (S/C1), 
365.267 days (S/C2) and 365.266 days (S/C3), respectively. We use Eq. (42) to 
change the initial velocities so that the average periods of S/C1, S/C2 and S/C3 are 
adjusted to 365.255 days, 365.257 days and 365.257 days, respectively. The initial 
conditions after this step are listed in column 4 of Table 3. In the next step, we use 
the following equations to trim the eccentricities so that the S/C orbits are nearly circular: 


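$$\begin{aligned}
R_{\rm new} &= R_{\rm prev} + \Delta R \approx \left(1 + \frac{\Delta R}{R}\right) R_{\rm prev}, \\
V_{\rm new} &= V_{\rm prev} + \Delta V \approx \left(1 - \frac{\Delta R}{R}\right) V_{\rm prev}.
\end{aligned} \tag{43}$$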


Here, R is the initial heliocentric distance of the spacecraft. The fractional adjustments 
+(ΔR/R) in $R_{\rm prev}$ and −(ΔR/R) in $V_{\rm prev}$ adjust the eccentricity without changing the period 
of the orbit. The initial conditions after all optimizations are listed in column 5 of 
Table 3. 

Table 3. Initial states of the S/C for the configuration with inclination angle λ = 1.0° at epoch JD2464500.0 for the initial choice, after 
period optimization, and after all optimizations, in the J2000 equatorial solar-system-barycentric coordinate system [41]. 

λ = 1.0°                      Initial choice of           Initial states of S/C       Initial states of S/C
                              S/C initial states          after period optimization   after final optimization

S/C1 position (AU)       X    -2.8842263289715 × 10^-2    -2.8842263289715 × 10^-2    -2.8842514605546 × 10^-2
                         Y     9.1157742309044 × 10^-1     9.1157742309044 × 10^-1     9.1158659433458 × 10^-1
                         Z     3.9552690922456 × 10^-1     3.9552690922456 × 10^-1     3.9553088730467 × 10^-1
S/C1 velocity (AU/day)   Vx   -1.7188548244458 × 10^-2    -1.7188535691176 × 10^-2    -1.7188363750567 × 10^-2
                         Vy   -2.8220395391983 × 10^-4    -2.8220375159556 × 10^-4    -2.8220098038726 × 10^-4
                         Vz   -4.4970276654173 × 10^-4    -4.4970243993363 × 10^-4    -4.4969796642665 × 10^-4
S/C2 position (AU)       X     8.7453598387569 × 10^-1     8.7453598387569 × 10^-1     8.7453598387569 × 10^-1
                         Y    -4.3802677355114 × 10^-1    -4.3802677355114 × 10^-1    -4.3802677355114 × 10^-1
                         Z    -2.0634980179207 × 10^-1    -2.0634980179207 × 10^-1    -2.0634980179207 × 10^-1
S/C2 velocity (AU/day)   Vx    8.2301784322477 × 10^-3     8.2301033726700 × 10^-3     8.2301033726700 × 10^-3
                         Vy    1.3797379424198 × 10^-2     1.3797253460590 × 10^-2     1.3797253460590 × 10^-2
                         Vz    6.1425805519808 × 10^-3     6.1425244722884 × 10^-3     6.1425244722884 × 10^-3
S/C3 position (AU)       X    -8.5683596527799 × 10^-1    -8.5683596527799 × 10^-1    -8.5679330969623 × 10^-1
                         Y    -4.8998222347472 × 10^-1    -4.8998222347472 × 10^-1    -4.8995800210059 × 10^-1
                         Z    -1.9592963105165 × 10^-1    -1.9592963105165 × 10^-1    -1.9591994878015 × 10^-1
S/C3 velocity (AU/day)   Vx    8.9788714330506 × 10^-3     8.9787977300014 × 10^-3     8.9792464008067 × 10^-3
                         Vy   -1.3530263187520 × 10^-2    -1.3530152097744 × 10^-2    -1.3530828362023 × 10^-2
                         Vz   -5.6998631854817 × 10^-3    -5.6998163886731 × 10^-3    -5.7001012664635 × 10^-3
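
The two adjustment rules of Eqs. (42) and (43) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, ΔT is taken as the target period minus the current period, and both rules are first-order relations for a nearly circular Keplerian orbit (from the vis-viva relation, speeding up at a fixed position lengthens the period, while trading distance against speed as in Eq. (43) leaves the semi-major axis unchanged to first order).

```python
def adjust_speed_for_period(v_prev, t_prev, t_target):
    """Eq. (42): rescale the initial speed to move the mean period to t_target."""
    delta_t = t_target - t_prev          # assumed sign convention: target minus current
    return (1.0 + delta_t / (3.0 * t_prev)) * v_prev

def trim_eccentricity(r_prev, v_prev, frac):
    """Eq. (43): frac = dR/R; distance scaled by (1 + frac), speed by (1 - frac)."""
    return (1.0 + frac) * r_prev, (1.0 - frac) * v_prev

def semi_major_axis(r, v, gm):
    """Semi-major axis from the vis-viva relation v^2 = gm * (2/r - 1/a)."""
    return 1.0 / (2.0 / r - v * v / gm)

# Period step for S/C2, using the average periods quoted in the text.
factor_sc2 = adjust_speed_for_period(1.0, 365.267, 365.257)

# Eccentricity trim on a circular orbit in units with gm = r = v = 1.
r_new, v_new = trim_eccentricity(1.0, 1.0, 1.0e-5)
a_new = semi_major_axis(r_new, v_new, 1.0)
```

For S/C2 the speed factor is about 1 − 9.1 × 10^-6, which is comparable to the relative change of the S/C2 velocity components between columns 3 and 4 of Table 3. The trimmed orbit keeps its semi-major axis, and hence its period, to first order in ΔR/R.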

For the inclination angles 0.0°, 0.5°, 1.5°, 2°, 2.5° and 3°, the optimization 
processes are similar to that for the 1.0° inclination, and the results can be found in Ref. 41. 


9. Deployment of Formation in Earthlike Solar Orbit 


Deployments to orbit around Earth, to halo orbits around the Earth-Moon Lagrange 
points, and to the Sun-Earth L1 and L2 points are well studied. Here, we say a few words 
on the deployment of a spacecraft to different positions of an Earthlike solar orbit. 
A preliminary design of the transfer orbits of the spacecraft from the separations of 
the launch vehicles to the mission orbits near the L3, L4 and L5 points has been given 
in Refs. 40 and 140. Let us review this preliminary design first. 

In the mission study of ASTROD I, the ASTROD I S/C is given an appropriate 
delta-V before the separation of the last launcher stage in Low Earth Orbit (LEO) 
and is injected directly into the solar orbit, proceeding in geodesic flight to the Venus swing-by. We 
can use the same strategy to launch the ASTROD-GW S/C directly into solar 
transfer orbits near the designated Hohmann transfer orbits or Venus swing-by 
orbit. This way, the only major delta-V needed for each S/C to reach the destination 
occurs near the destination, to boost the S/C to stay near the destined Lagrange 
point. In rows 2-4 of Table 4, we list the types of transfer orbits, the transfer times, the 
values of solar transfer delta-V and the propellant mass ratios for the three ASTROD-GW 
S/C. These estimates hold for any other S/C deployed to the same positions. 
The propellant mass ratios are about 0.50-0.55, 0.280 and 0.47 for S/C 1, 2 and 3, respectively. 
For an ASTROD-GW S/C with a dry mass of 500 kg, the corresponding total masses 
are 1111-1266, 723 and 1035 kg for the 3 S/C, respectively (including the propellant 
and the propulsion module with a mass of 10% of the propellant). 
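
The quoted mass figures can be reproduced with a short Python calculation. The sketch below assumes that the propellant mass ratios follow from the Tsiolkovsky rocket equation with a specific impulse of 320 s, and that the total mass is the dry mass plus the propellant plus a propulsion module of 10% of the propellant mass; the function names are hypothetical.

```python
import math

G0 = 9.80665  # standard gravity in m/s^2

def propellant_mass_ratio(delta_v_km_s, isp_s=320.0):
    """Propellant mass over initial mass from the rocket equation."""
    return 1.0 - math.exp(-delta_v_km_s * 1000.0 / (G0 * isp_s))

def total_mass(dry_mass_kg, ratio, module_fraction=0.10):
    """Total mass when the propulsion module weighs module_fraction of the propellant.

    total = dry + (1 + module_fraction) * propellant, with propellant = ratio * total.
    """
    return dry_mass_kg / (1.0 - (1.0 + module_fraction) * ratio)

for label, dv in (("near L4", 1.028), ("near L5", 2.0), ("near L3", 2.2), ("near L3", 2.5)):
    print(label, dv, round(propellant_mass_ratio(dv), 3))

for ratio in (0.50, 0.55, 0.280, 0.47):
    print(ratio, round(total_mass(500.0, ratio)))
```

Under these assumptions, delta-V values of 1.028, 2 and 2.2-2.5 km/s give ratios close to 0.28, 0.47 and 0.50-0.55, and the ratios translate into total masses of about 723, 1035 and 1111-1266 kg for a 500 kg dry mass.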

For deployment to other locations in the solar orbit, we made estimates and 
list them in rows 5-7 of Table 4. The baseline is: (i) the S/C is propelled by a 
high-efficiency propulsion module (including the propellant with specific impulse 320 s 
and the propulsion module with a mass of 10% of the propellant) for large delta-V 
maneuvers and for delivery to the destination; (ii) this module is to be separated 
when the destination state is achieved. 

Further studies on the optimizations of deployment from separation of 
launcher(s) for the orbit configurations with inclinations and for a period of 20 
years are ongoing for both LISA-like missions and ASTROD-GW-like missions. !+4 


10. Time Delay Interferometry 


We began discussing TDI in Sec. 4 and continue here. To achieve the required GW 
sensitivity, space GW missions need TDI to suppress laser frequency noise. 

Schematic orbit configuration of LISA-type mission design? and ASTROD-GW 
mission design [41] are shown in Figs. 1 and 2, respectively. For the numerical 
evaluation, we take a common receiving time epoch for both beams; the results would be 
very close to each other numerically if we took the same start time epoch and 
calculated the path differences. We refer to the path S/C1 → S/C2 → S/C1 as a 
and the path S/C1 → S/C3 → S/C1 as b. Hence, the difference ΔL between 
Paths 1 and 2 for the unequal-arm Michelson can be denoted as ab − ba = [a, b]. Here 
ab means path a followed by path b. The unequal-arm Michelson is now commonly 
called the X-configuration [87, 88]. The result of this TDI calculation for the ASTROD-GW 
orbit with 1° inclination is shown in Fig. 9. 

Table 4. Estimated delta-V and propellant mass ratio for solar transfer of S/C. 

Degrees ahead of Earth    Transfer orbit                    Transfer time          Solar transfer delta-V      Solar transfer
in solar orbit                                                                     after injection from LEO    propellant mass ratio
                                                                                   to solar transfer orbit     (Isp = 320 s)

180° (near L3)            Venus flyby transfer              1.3-1.5 year           2.2-2.5 km/s                0.50-0.55
60° (near L4)             Inner Hohmann, 2 revolutions      1.833 year             1.028 km/s                  0.280
300° (−60°) (near L5)     Outer Hohmann, 1 revolution       1.167 year             2 km/s                      0.47
0-60°                     Inner Hohmann, < 2 revolutions    Less than 1.833 year   Less than 1.028 km/s        Less than 0.280
60°-300°                  Venus flyby transfer              1.3-1.5 year           2.2-2.5 km/s                0.50-0.55
300°-360°                 Outer Hohmann, 1 revolution       Less than 1.167 year   Less than 2 km/s            Less than 0.47

The first-generation and second-generation TDIs have been proposed since 1999 [87, 88]. 
In the first-generation TDIs, static situations are considered, while in the 
second-generation TDIs, motions are compensated to a certain degree. The 
X-configurations considered above belong to the first-generation TDI 
configurations. We note that the numerical method has the advantage of handling all 
generations in a single calculation format. We shall not review these 
developments further here, but refer the readers to the excellent review paper by Tinto and 
Dhurandhar®® for a comprehensive treatment. 
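
The idea that one numerical routine can evaluate path differences of any generation can be illustrated with the following Python sketch. It is a simplified toy, not the calculation used for Fig. 9 or Table 5: the function names are hypothetical, each round trip is approximated by twice the arm length at the reception time of that leg, and the arms are assumed to drift linearly, whereas a real calculation would propagate light between spacecraft positions taken from the numerical ephemeris.

```python
C = 299_792_458.0  # speed of light in m/s

def round_trip_time(arm_length, t_receive):
    """Approximate round-trip light time (s) for a leg ending at t_receive.

    arm_length is a callable giving the arm length in metres at time t (s).
    """
    return 2.0 * arm_length(t_receive) / C

def path_time(legs, t_receive):
    """Total light time of a composite path such as [a, b] for 'ab'.

    Legs are traversed in list order, so the last leg ends at t_receive;
    the path is unwound backwards in time from the common reception epoch.
    """
    total = 0.0
    t = t_receive
    for arm_length in reversed(legs):
        dt = round_trip_time(arm_length, t)
        total += dt
        t -= dt
    return total

def tdi_difference(path_1, path_2, t_receive):
    """Light-time difference (s) between two composite paths."""
    return path_time(path_1, t_receive) - path_time(path_2, t_receive)

# Toy arm models (assumed): 1 Gm arms drifting at +2 m/s and -3 m/s.
def arm_a(t):
    return 1.0e9 + 2.0 * t

def arm_b(t):
    return 1.0e9 - 3.0 * t

first_gen = tdi_difference([arm_a, arm_b], [arm_b, arm_a], 0.0)      # [a, b]
second_gen = tdi_difference([arm_a, arm_b, arm_b, arm_a],
                            [arm_b, arm_a, arm_a, arm_b], 0.0)       # [ab, ba]
print(first_gen, second_gen)
```

In this toy model the first-generation difference [a, b] is proportional to the relative arm velocity, while the second-generation combination [ab, ba] cancels that first-order term and leaves a much smaller residual, which here is at the level of double-precision rounding.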

We compile for comparison the resulting differences for second-generation 
two-arm TDIs with n = 1 and n = 2 (n is the degree of the polynomial in ab, ba, a² and 
b²) due to arm length variations for various mission proposals (eLISA/NGO, 
an NGO-LISA-type mission with a nominal arm length of $2 \times 10^{6}$ km, LISA and 
ASTROD-GW) in Table 5. 

We note that: 


(i) All the second-generation TDIs considered for the one-detector case for 
eLISA/NGO, for the NGO-LISA-type mission with $2 \times 10^{6}$ km arm length, for LISA and for 
ASTROD-GW with 1° inclination basically satisfy the requirement (Table 5). 

(ii) The requirement for the unequal-arm Michelson (X-configuration) TDI of 
ASTROD-GW needs to be relaxed by about two orders of magnitude (Fig. 9 and Table 5). 

(iii) In view of the possibility of a GW mission in Earth orbit, numerical TDI studies 
for GW missions in Earth orbit are desirable. 

(iv) An experimental demonstration of TDI in the laboratory for LISA was 
implemented in 2010.14? The eLISA and the original ASTROD-GW TDI requirements are 
based on the LISA requirement, and hence are also demonstrated. With the present 
pace of development in laser technology, improvements in laser frequency noise 
are expected to compensate for a relaxation of the TDI requirement by 2-3 orders 
of magnitude within 20 years. 

(v) The X-configuration TDI sensitivity for GW sources has been studied extensively 
for eLISA.?! It satisfies the present technological requirements well. With the 
enhanced laser technology expected, it would also be good for studying 
ASTROD-GW and various GW missions in Earth orbit. Studies of GW 
sensitivity and GW sources for other first-generation and second-generation 
TDIs and for other missions are also encouraged. 

Fig. 9. Path length differences between two optical paths of the unequal-arm Michelson TDI 
configuration (X-configuration) for ASTROD-GW orbit formation with 1° inclination. 

Table 5. Comparison of the resulting differences for second-generation TDIs (n = 1 and n = 2) 
due to arm length variations for various mission proposals: eLISA/NGO, an NGO-LISA-type 
mission with a nominal arm length of $2 \times 10^{6}$ km, LISA and ASTROD-GW. 

TDI configuration           TDI path difference ΔL
                            eLISA/NGO [84]            NGO-LISA-type with        LISA®&                 ASTROD-GW
                                                      2 × 10^6 km               (1000 days)            (1° inclination) [41]
                                                      arm length [84]

Duration                    1000 days                 1000 days                 1000 days              10 years
n = 1   [ab, ba]            -1.5 to +1.5 ps           -11 to +12 ps             -70 to +80 ps          -228 to +228 ns
n = 2   [a²b², b²a²]        -11 to +12 ps             -90 to +100 ps            -600 to +650 ps        -1813 to +1813 ns
        [abab, baba]        -6 to +6 ps               -45 to +50 ps             -300 to +340 ps        -907 to +907 ns
        [ab²a, ba²b]        -0.0032 to +0.0034 ps     -0.0036 to +0.004 ps      -0.015 to +0.013 ps    -0.66 to +0.66 ns
Nominal arm length          1 Gm (1 Mkm)              2 Gm                      5 Gm                   260 Gm
Requirement on ΔL           10 m (30 ns)              20 m (60 ns)              50 m (150 ns)          500 m (1500 ns)


11. Payload Concept 


GW detection in space basically measures the distance change between two S/C 
(or celestial bodies) as GW comes by. The two S/C (or celestial bodies) must be 
in geodesic motion (or such motion can be deduced). The distance measurement 
must be ultra-sensitive as the GWs are weak. A typical implementation (mission) 
consists of three spacecraft in an almost equilateral triangle formation. The three 
spacecraft range interferometrically with one another. Each spacecraft carries a 
payload of two proof masses, two telescopes, two lasers, a weak light detection and 
handling system, a laser stabilization system, and a drag-free system. For the lower 
part of the space GW band, or possibly for higher precision, a precision/optical clock 
or an absolute laser stabilization system, and an absolute laser metrology system, 
may be used. 

Weak light phase locking and handling: This is important for solar orbit 
missions. For ASTROD-GW with a distance of 260 Gm (1.73 AU), there is a need to 
phase lock a local laser to 100 fW incoming light in order to amplify and manipulate it. 
For 100 fW (λ = 1064 nm) weak light, there are $5 \times 10^{5}$ photons/s. This would 
be good for 100 kHz frequency tuning. For LISA, 85 pW weak-light phase locking 
is required. At Tsing Hua University, 2 pW weak-light phase locking with a 0.2 mW 
local oscillator has been demonstrated.?%3° At the Jet Propulsion Laboratory (JPL), 
Dick et al.8? have achieved offset phase locking to 40 fW incoming laser light. 
Future development should focus on frequency tracking, modulation-demodulation 
and coding-decoding to make this a mature experimental technique. 
This is also important for deep space optical communication. 
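
The photon arrival rate quoted for 100 fW of 1064 nm light follows from dividing the optical power by the photon energy hc/λ. The short Python sketch below does this arithmetic for the power levels mentioned in the text; the function name is hypothetical.

```python
H_PLANCK = 6.62607015e-34   # Planck constant in J s
C_LIGHT = 299_792_458.0     # speed of light in m/s

def photon_rate(power_w, wavelength_m=1064e-9):
    """Photons per second carried by a beam of the given optical power."""
    return power_w * wavelength_m / (H_PLANCK * C_LIGHT)

for label, power in (("100 fW", 100e-15), ("40 fW", 40e-15),
                     ("2 pW", 2e-12), ("85 pW", 85e-12)):
    print(label, f"{photon_rate(power):.2e} photons/s")
```

For 100 fW the rate is a little over 5 × 10^5 photons/s, so only a few photons arrive per cycle of a 100 kHz tuning bandwidth.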

Drag-free system design and development: A drag-free system consists of a 
high-precision accelerometer/inertial sensor to detect non-drag-free motions and a 
micro-thruster system to provide the feedback that keeps the spacecraft drag-free. LISA Pathfinder 
successfully demonstrated and tested the drag-free technology in the frequency 
range above 100 μHz, satisfying not just the requirement of LISA Pathfinder but 
also the requirement of LISA [11]. The success paved the way for all the 
space mission proposals in Table 1. However, for the lower part (100 nHz-100 μHz) of 
the space frequency band, more work is needed. We have discussed the frequency 
sensitivity spectrum and reddening factors in Sec. 5. To suppress the reddening factors 
requires the position sensing noise to be flat down to 100 nHz and the gravitational acceleration 
due to the spacecraft to be small and modeled to the required level at low 
frequencies. The self-gravity acceleration needs to be stable or subtracted in real time. An 
absolute laser metrology system to monitor the positions of the major mass distribution 
in the S/C will be implemented to do this. To completely drop the factor or to go 
beyond, one may need to go to optical sensing and optical feedback control. As to 
the accelerometer/inertial sensor design of ASTROD, an absolute laser metrology 
system is proposed to push the noise down, in particular in the lower frequency 
region. In addition, ASTROD is proposed to monitor the positions of various parts 
of the spacecraft to facilitate gravitational modeling.?”7° 

Micro-thruster system: For drag-free feedback control, micro-thrusters are 
needed. A Field Emission Electric Propulsion (FEEP) system, with its high specific 
impulse, is a good candidate for the micro-thruster system. The sensitivity of the FEEP 
system is good and is in the μN range. The main issue for FEEPs is lifetime. Due 
to technical problems during the development of the FEEP technology, cold gas 
thrusters have become the alternative choice. The GAIA mission carries cold gas 
thrusters for the attitude and orbit control system (AOCS).14? MICROSCOPE 
and LISA Pathfinder are equipped with cold gas thrusters based on the GAIA 
thrusters. The main disadvantage of cold gas thrusters compared to FEEPs is the 
higher propellant mass per delta-V. The total mission duration is limited by the amount of 
propellant stored in the tanks. Therefore, the FEEP technology would be preferred 
if it becomes available at a later time. 

Laser system: Nd:YAG nonplanar ring oscillators pumped by laser diodes are 
available with an output power of 2 W. The frequency noise must be suppressed 
to a very low level. The strategy is like the one adopted by NGO/eLISA, using 
pre-stabilization, arm locking?! and TDI (Sec. 10). 

Laser frequency standard/Clock: Space optical clocks and optical comb fre- 
quency synthesizer technologies are important in the realization and simplification 
of the GW mission target sensitivity at lower frequency. Another use of the optical 
clock and optical comb frequency synthesizer is to calibrate the optical metrology 
for ASTROD-GW-like missions. This is important for the laser metrology iner- 
tial sensor and for monitoring distances inside spacecraft, to correct local gravity 
changes due to, for example, thermal effects. All these measurements use lasers as 
standard rods. They need to be calibrated using optical frequency standards or 
absolutely stabilized laser frequency standard referenced to an atomic or molecular 
line. The advent of optical clocks and optical combs in space may simplify 
the experimental design of ASTROD-GW-like missions. 

At present, optical clocks in the laboratory!*° have reached a fractional 
inaccuracy at the $10^{-18}$ level, and they are improving. Clocks of this accuracy level or better 
can be used for exquisitely sensitive measurements of gravity, motion, and inertial 
navigation. The use of such clocks will certainly facilitate the detection of 
lower-frequency GWs and stimulate the redesign of the implementation 
schemes for lower-frequency space GW detection. 

Absolute laser metrology system: With an ultraprecise laser frequency 
standard/clock, an absolute laser metrology system can be built to monitor the positions 
of various parts of the spacecraft to facilitate gravitational modeling. 

Radiation monitor: A small radiation detector onboard the spacecraft will 
monitor test-mass charging of the inertial drag-free sensors. This radiation monitor can 
also be used for measuring Solar Energetic Particles (SEPs) and Galactic Cosmic 
Rays (GCRs) in the area of solar and galactic physics, with corresponding 
applications to space weather [146, 147]. 


12. Outlook 


White dwarfs were discovered in 1910, with their density soon estimated. We now 
understand that GWs from white dwarf binaries in our Galaxy form a stochastic GW 
background (“confusion limit”)'4* for space GW detection in general relativity. The 
characteristic strain of the confusion limit is about $10^{-20}$ in the 0.1-1 mHz band. As to 
individual sources, some can have a characteristic strain around this level for 
frequencies of 1-3 mHz in the low-frequency band. One hundred years ago, the sensitivity of 
astrometric observation through the atmosphere around this band was about 1 arcsec. 
This means that the strain sensitivity for GW detection was about $10^{-5}$, 15 orders of magnitude away 
from the required sensitivity. 

The first artificial satellite, Sputnik, was launched in 1957. The technological 
demonstration mission LISA Pathfinder was launched on 3 December 2015. This 
mission successfully tested and demonstrated the drag-free technology, satisfying not 
just the requirement of LISA Pathfinder but also, basically, the drag-free 
requirement of the LISA GW space mission concept [11]. Thus, the major part of the 
technological gap of 15 orders of magnitude has been successfully bridged during the last hundred 
years. The success paved the way for all the space mission proposals (Table 1). 
At present, the space GW missions are expected to be launched within two decades. 
Weak-light phase locking has been demonstrated in laboratories.?9°°8? Weak-light 
technology still needs development. We do anticipate the possibility of an earlier 
launch date for eLISA (or a substitute mission) and possibly earlier flights of other 
missions. With the first direct detection of GWs by LIGO and the success of the LISA 
Pathfinder mission, the outlook for space detection of GWs is bright. 

The science goals of space GW detectors are the detection of GWs from (i) Massive 
Black Holes; (ii) Extreme-Mass-Ratio Black Hole Inspirals; (iii) Intermediate-Mass 
Black Holes; (iv) Galactic Compact Binaries and (v) Relic GW Background. 
As we can readily see from Figs. 4-1, the signal-to-noise ratios (S/N) for GW 
detection of MBHB mergers are very high, and for the high-S/N detection of more 
massive mergers the strain sensitivity in the lower part (100 nHz-100 μHz) of the space 
detection band is important. For this purpose, longer arms have advantages. Longer 
arm missions would be good to complement PTAs in the exploration of black hole 
co-evolution with galaxies. Longer arm missions, with their better angular resolution, are 
also more effective in the determination of the equation of state of dark energy, 
testing relativistic gravity and, possibly, probing inflationary physics. Efforts 
in minimizing the accelerometer/inertial sensor noise over the MLDC formula or 
beyond will strengthen these goals. Deployment of S/C to any position in the 
Earthlike solar orbit could take less than 1.8 years with a propellant mass ratio less than 0.55. 
This is within the practical range of launcher implementation. 

Now, we list important issues for further studies in order to realize and sharpen 
our expectations for GW detection in the frequency range 100 nHz-100 Hz: 


(i) Manipulating weak light. 

(ii) Improvement of low-frequency acceleration noise. 

(iii) Fourier spectrum of perturbations due to celestial bodies in the solar system 
and the precision needed to know the positions of solar system bodies in order 
to separate this spectrum from GW spectrum. 

(iv) Further studies in optimizing deployment delta-V and propellant ratio. 

(v) Optimizing the inclination angle of the ASTROD-GW-like constellation. 

(vi) Extraction of GW signals based on precise numerical orbits. 

(vii) Further studies in the angular resolution of GW sources. 

(viii) Separation of weak lensing effects from GW signals. 


It is time to think seriously about second-generation space GW detectors: 
BBO, DECIGO, Super-ASTROD and the like. Optical clocks in the laboratory have 
reached a fractional inaccuracy at the $10^{-18}$ level, and their inaccuracy is still improving. 
Clocks of this accuracy level will be developed for space use. This development is 
good for the laser pulse ranging scheme for Super-ASTROD. A laser pulse timing 
accuracy of 3 ps has already been achieved in T2L2 on board the JASON2 satellite.49 A precision of 0.9 mm 
(3 ps) out of 1300 Gm (8.6 AU) is $7 \times 10^{-16}$. At 1 Hz, the characteristic strain would 
reach 7 x 107!* precision. This is comparable to the best of the lower frequency 
strain acceleration noise level in Fig. 5. Pulse timing accuracy is still improving. It 
would be good to study this scheme in more detail. 
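
The fractional ranging precision quoted above is a one-line calculation: the timing accuracy is converted to a length with the speed of light and divided by the baseline. The Python sketch below reproduces it; the function names are hypothetical, and the conversion treats the 3 ps timing accuracy as a one-way light-time uncertainty, which is an assumption.

```python
C_LIGHT = 299_792_458.0  # speed of light in m/s

def range_precision_m(timing_accuracy_s):
    """Length corresponding to a light-time uncertainty."""
    return C_LIGHT * timing_accuracy_s

def fractional_precision(timing_accuracy_s, baseline_m):
    """Range precision relative to the baseline length."""
    return range_precision_m(timing_accuracy_s) / baseline_m

print(range_precision_m(3e-12))               # about 0.9 mm
print(fractional_precision(3e-12, 1300e9))    # about 7e-16
```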